TY - GEN
T1 - Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding
AU - Liu, Yi-Chieh
AU - Hsieh, Yung-An
AU - Chen, Min-Hung
AU - Yang, C.-H. Huck
AU - Tegner, Jesper
AU - Tsai, Y.-C. James
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2020
Y1 - 2020
N2 - Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models’ performance visually. By examining the video attention saliency, we found that existing models could not precisely capture the causes (e.g., traffic light) of the specific action (e.g., stopping). Therefore, the Temporal Reasoning Block (TRB) was proposed and introduced to the models. With the TRB models, we achieved an accuracy of 86.3%, which outperforms the state-of-the-art 3D CNNs from previous work. The attention saliency also demonstrated that TRB helped models focus on the causes more precisely. With both numerical and visual evaluations, we concluded that our proposed TRB models were able to provide accurate driving behavior prediction by learning the causal reasoning of the behaviors.
AB - Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models’ performance visually. By examining the video attention saliency, we found that existing models could not precisely capture the causes (e.g., traffic light) of the specific action (e.g., stopping). Therefore, the Temporal Reasoning Block (TRB) was proposed and introduced to the models. With the TRB models, we achieved an accuracy of 86.3%, which outperforms the state-of-the-art 3D CNNs from previous work. The attention saliency also demonstrated that TRB helped models focus on the causes more precisely. With both numerical and visual evaluations, we concluded that our proposed TRB models were able to provide accurate driving behavior prediction by learning the causal reasoning of the behaviors.
UR - http://hdl.handle.net/10754/660688
UR - https://ieeexplore.ieee.org/document/9053783/
U2 - 10.1109/ICASSP40776.2020.9053783
DO - 10.1109/ICASSP40776.2020.9053783
M3 - Conference contribution
SN - 978-1-5090-6632-2
BT - ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
PB - IEEE
ER -