Temporal segmentation and keyframe selection methods for user-generated video search-based annotation

publication date

  • January 2015

start page

  • 488

end page

  • 502

issue

  • 1

volume

  • 42

international standard serial number (ISSN)

  • 0957-4174

electronic international standard serial number (EISSN)

  • 1873-6793

abstract

  • In this paper we propose a temporal segmentation method and a keyframe selection method for User-Generated Video (UGV). Since UGV is rarely structured in shots and users' interests are usually revealed through camera movements, a UGV temporal segmentation system has been proposed that generates a video partition based on a camera motion classification. Motion-related mid-level features have been suggested to feed a Hierarchical Hidden Markov Model (HHMM) that produces a user-meaningful UGV temporal segmentation. Moreover, a keyframe selection method has been proposed that picks a single keyframe for fixed-content camera motion patterns such as zoom, still, or shake, and a set of keyframes for varying-content translation patterns. The proposed video segmentation approach has been compared to a state-of-the-art algorithm, achieving an 8% performance improvement in a segmentation-based evaluation. Furthermore, a complete search-based UGV annotation system has been developed to assess the influence of the proposed algorithms on an end-user task. To that end, two UGV datasets have been developed and made available online. Specifically, the relevance of the considered camera motion types has been analyzed for these two datasets, and some guidelines are given to achieve the desired performance-complexity tradeoff. The keyframe selection algorithm for varying-content translation patterns has also been assessed, revealing a notable contribution to the performance of the global UGV annotation system. Finally, it has been shown that the UGV segmentation algorithm also produces improved annotation results with respect to a fixed-rate keyframe selection baseline or a traditional method relying on frame-level visual features.
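
  • Illustrative sketch (not taken from the paper): the keyframe selection rule described in the abstract could look roughly like the code below. It assumes the video has already been segmented by camera motion type. The `Segment` type, the `select_keyframes` function, the choice of the middle frame for fixed-content segments, and the fixed `stride` for translation segments are all hypothetical. The abstract does not say how the actual method chooses frames within a segment.

```python
from typing import List, NamedTuple

# Motion patterns the abstract describes as fixed-content.
FIXED_CONTENT = {"zoom", "still", "shake"}


class Segment(NamedTuple):
    """Hypothetical container for one camera-motion segment."""
    motion: str  # e.g. "zoom", "still", "shake", "translation"
    start: int   # first frame index (inclusive)
    end: int     # last frame index (inclusive)


def select_keyframes(segments: List[Segment], stride: int = 30) -> List[int]:
    """Return frame indices chosen as keyframes.

    Assumption: one middle frame for fixed-content segments, and one frame
    every `stride` frames for varying-content (translation) segments.
    """
    keyframes: List[int] = []
    for seg in segments:
        if seg.motion in FIXED_CONTENT:
            # Content barely changes, so a single frame represents the segment.
            keyframes.append((seg.start + seg.end) // 2)
        else:
            # Content changes along the segment, so sample several frames.
            keyframes.extend(range(seg.start, seg.end + 1, stride))
    return keyframes


segments = [
    Segment("still", 0, 99),
    Segment("translation", 100, 189),
    Segment("zoom", 190, 250),
]
print(select_keyframes(segments))
```

    In this sketch the translation segment contributes several keyframes while the still and zoom segments contribute one each, which mirrors the single-keyframe versus keyframe-set distinction made in the abstract.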

keywords

  • user generated video; video annotation; video temporal segmentation; camera motion analysis; keyframe selection; hidden markov model; shot boundary detection; retrieval; algorithm