TY - GEN
T1 - Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos
AU - Heilbron, Fabian Caba
AU - Niebles, Juan Carlos
AU - Ghanem, Bernard
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: Research in this publication was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research, the Stanford AI Lab-Toyota Center for Artificial Intelligence Research, and a Google Faculty Research Award (2015).
PY - 2016/12/13
Y1 - 2016/12/13
N2 - In many large-scale video analysis scenarios, one is interested in localizing and recognizing human activities that occur in short temporal intervals within long untrimmed videos. Current approaches for activity detection still struggle to handle large-scale video collections, and the task remains relatively unexplored. This is due in part to the computational complexity of current action recognition approaches and the lack of a method that proposes fewer intervals in the video on which activity processing can be focused. In this paper, we introduce a proposal method that aims to recover temporal segments containing actions in untrimmed videos. Building on techniques for learning sparse dictionaries, we introduce a learning framework to represent and retrieve activity proposals. We demonstrate that our method not only produces high-quality proposals but also does so efficiently. Finally, we show the positive impact our method has on recognition performance when used for action detection, while running at 10 FPS.
AB - In many large-scale video analysis scenarios, one is interested in localizing and recognizing human activities that occur in short temporal intervals within long untrimmed videos. Current approaches for activity detection still struggle to handle large-scale video collections, and the task remains relatively unexplored. This is due in part to the computational complexity of current action recognition approaches and the lack of a method that proposes fewer intervals in the video on which activity processing can be focused. In this paper, we introduce a proposal method that aims to recover temporal segments containing actions in untrimmed videos. Building on techniques for learning sparse dictionaries, we introduce a learning framework to represent and retrieve activity proposals. We demonstrate that our method not only produces high-quality proposals but also does so efficiently. Finally, we show the positive impact our method has on recognition performance when used for action detection, while running at 10 FPS.
UR - http://hdl.handle.net/10754/622892
UR - http://ieeexplore.ieee.org/document/7780580/
U2 - 10.1109/CVPR.2016.211
DO - 10.1109/CVPR.2016.211
M3 - Conference contribution
SN - 9781467388511
BT - 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
PB - Institute of Electrical and Electronics Engineers (IEEE)
ER -