TY - GEN
T1 - End-to-end, single-stream temporal action detection in untrimmed videos
AU - Buch, Shyamal
AU - Escorcia, Victor
AU - Ghanem, Bernard
AU - Fei-Fei, Li
AU - Niebles, Juan Carlos
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2019/5/1
Y1 - 2019/5/1
AB - In this work, we present a new intuitive, end-to-end approach for temporal action detection in untrimmed videos. We introduce our new architecture for Single-Stream Temporal Action Detection (SS-TAD), which effectively integrates joint action detection with its semantic sub-tasks in a single unifying end-to-end framework. We develop a method for training our deep recurrent architecture based on enforcing semantic constraints on intermediate modules that are gradually relaxed as learning progresses. We find that such a dynamic learning scheme enables SS-TAD to achieve higher overall detection performance, with fewer training epochs. By design, our single-pass network is very efficient and can operate at 701 frames per second, while simultaneously outperforming the state-of-the-art methods for temporal action detection on THUMOS’14.
UR - http://hdl.handle.net/10754/663479
UR - http://www.bmva.org/bmvc/2017/papers/paper093/index.html
UR - http://www.scopus.com/inward/record.url?scp=85084013937&partnerID=8YFLogxK
U2 - 10.5244/c.31.93
DO - 10.5244/c.31.93
M3 - Conference contribution
SN - 190172560X
BT - Proceedings of the British Machine Vision Conference 2017
PB - British Machine Vision Association
ER -