Evaluating a bag-of-visual features approach using spatio-temporal features for action recognition

authors

  • VELASTIN CARROZA, SERGIO ALEJANDRO
  • Nazir, Saima
  • YOUSAF, HAROON MUHAMMAD

publication date

  • November 2018

start page

  • 660

end page

  • 669

volume

  • 72

International Standard Serial Number (ISSN)

  • 0045-7906

Electronic International Standard Serial Number (EISSN)

  • 1879-0755

abstract

  • The detection of spatio-temporal interest points plays a key role in human action recognition algorithms. This work exploits the strengths of the bag-of-visual-features approach and presents a method for automatic action recognition in realistic and complex scenarios. It provides a better feature representation by combining a well-known feature detector with a well-known descriptor, i.e., the 3D Harris space-time interest point detector and the 3D Scale-Invariant Feature Transform (SIFT) descriptor. Action videos are then represented as histograms of visual features, following the traditional bag-of-visual-features approach, and a support vector machine (SVM) classifier is used for training and testing. Extensive experiments on existing benchmark datasets show the effectiveness of the method and state-of-the-art performance. The article reports 68.1% mean Average Precision (mAP) on Hollywood-2, and average accuracies of 94% and 91.8% on the UCF Sports and KTH datasets, respectively. © 2018 Elsevier Ltd. All rights reserved.
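
The histogram step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a codebook of visual words is already available (in a traditional bag-of-visual-features pipeline this typically comes from clustering training descriptors, which the abstract does not detail), and it uses toy 2-D vectors as stand-ins for real 3D SIFT descriptors. The function names and all values are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bovf_histogram(descriptors, codebook):
    """Build an L1-normalised histogram of codeword assignments for one video."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    # A video with no detected interest points yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: 2-D stand-ins for 3D SIFT descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
video_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.9, 4.2)]

histogram = bovf_histogram(video_descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Each video is reduced to one fixed-length vector regardless of how many interest points were detected in it, which is what allows a standard classifier such as an SVM to be trained on the resulting histograms.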

subjects

  • Computer Science

keywords

  • human action recognition; local spatio-temporal features; bag-of-visual features; Hollywood-2 dataset