TAB: Temporally aggregated bag-of-discriminant-words for temporal action proposals

authors

  • Murtaza, Fiza
  • Yousaf, Muhammad Haroon
  • Velastin Carroza, Sergio Alejandro

publication date

  • June 2019

start page

  • 42

end page

  • 52

volume

  • 183

International Standard Serial Number (ISSN)

  • 1077-3142

Electronic International Standard Serial Number (EISSN)

  • 1090-235X

abstract

  • In this work, we propose Temporally Aggregated Bag-of-Discriminant-Words (TAB), a new method for generating temporal action proposals from long untrimmed videos. TAB is based on the observation that there are many overlapping frames in the action and background temporal regions of untrimmed videos, which makes it difficult to segment actions from non-action regions. TAB addresses this issue by extracting class-specific codewords from the action and background videos and computing discriminative weights for these codewords based on their ability to discriminate between the two classes. We integrate these discriminative weights with Bag-of-Words encoding and call the result Bag-of-Discriminant-Words (BoDW). We sample the untrimmed videos into non-overlapping snippets and temporally aggregate the BoDW representations of multiple snippets into action proposals using a binary classifier trained on trimmed videos in a single pass. TAB can be used with different types of features, including those computed by deep networks. We demonstrate the effectiveness of the TAB proposal extraction method on two challenging temporal action detection datasets, MSR-II and Thumos14, where it improves upon the state of the art with recall rates of 87.0% and 82.0%, respectively, at a temporal intersection over union ratio of 0.8.
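
The recall figures in the abstract are reported at a temporal intersection over union (tIoU) of 0.8. As a minimal sketch of how such a metric is commonly computed (not the authors' evaluation code), the Python below scores proposals against ground-truth segments. The function names `temporal_iou` and `recall_at_tiou`, the `(start, end)` segment format in seconds, and the example segments are all assumptions made for illustration.

```python
def temporal_iou(a, b):
    """Temporal IoU of two segments, each given as (start, end)."""
    # Overlap length is zero when the segments do not intersect.
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_tiou(ground_truth, proposals, threshold=0.8):
    """Fraction of ground-truth segments covered by at least one
    proposal whose tIoU with that segment reaches the threshold."""
    if not ground_truth:
        return 0.0
    hits = sum(
        any(temporal_iou(gt, p) >= threshold for p in proposals)
        for gt in ground_truth
    )
    return hits / len(ground_truth)


# Hypothetical segments, chosen only to illustrate the computation.
ground_truth = [(10.0, 20.0), (40.0, 50.0)]
proposals = [(11.0, 20.0), (30.0, 45.0)]

print(temporal_iou(ground_truth[0], proposals[0]))
print(recall_at_tiou(ground_truth, proposals, threshold=0.8))
```

In this toy case the first ground-truth segment is matched (9 seconds of overlap over a 10-second union) while the second is not (5 seconds of overlap over a 20-second union), so half of the ground-truth segments are recalled at the 0.8 threshold.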

subjects

  • Computer Science

keywords

  • temporal action detection; bag of words; temporal action proposals