ActivityNet: A large-scale video benchmark for human activity understanding

Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1862 Scopus citations

Abstract

In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize. This is in part due to the simplicity of current benchmarks, which mostly focus on simple actions and movements occurring on manually trimmed videos. In this paper we introduce ActivityNet, a new large-scale video benchmark for human activity understanding. Our benchmark aims at covering a wide range of complex human activities that are of interest to people in their daily living. In its current version, ActivityNet provides samples from 203 activity classes with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours. We illustrate three scenarios in which ActivityNet can be used to compare algorithms for human activity understanding: untrimmed video classification, trimmed activity classification and activity detection.

Original languageEnglish (US)
Title of host publicationIEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
PublisherIEEE Computer Society
Pages961-970
Number of pages10
ISBN (Electronic)9781467369640
DOIs
StatePublished - Oct 14 2015
EventIEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
Duration: Jun 7 2015Jun 12 2015

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume07-12-June-2015
ISSN (Print)1063-6919

Conference

ConferenceIEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Country/TerritoryUnited States
CityBoston
Period06/7/1506/12/15

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Fingerprint

Dive into the research topics of 'ActivityNet: A large-scale video benchmark for human activity understanding'. Together they form a unique fingerprint.

Cite this