TY - JOUR
T1 - Proposal-based visual tracking using spatial cascaded transformed region proposal network
AU - Zhang, Ximing
AU - Luo, Shujuan
AU - Fan, Xuewu
N1 - KAUST Repository Item: Exported on 2022-06-14
Acknowledgements: This research was funded by and Ministry of National Defense of China, grant number GFZX04014307 respectively. Thanks to the experimental data provided by University of Ljubljana, SICK, Hiar, King Abdullah University of Science and Technology. Thanks to the experimental facilities provided by Xi’an Institute of Optics and Precision Mechanics of CAS.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2020/8/26
Y1 - 2020/8/26
N2 - Region proposal network (RPN) based trackers employ classification and regression blocks to generate proposals; the proposal with the highest similarity score is taken as the ground-truth candidate for the next frame. However, RPN-based trackers cannot make full use of the features from different convolutional layers, and the original loss function cannot alleviate the data imbalance issue in the training procedure. We propose the Spatial Cascaded Transformed RPN, which combines the RPN with a spatial transformer network (STN) to obtain high-quality proposals while simultaneously improving robustness. The STN transfers spatially transformed features through different stages, which extends the spatial representation capability of such networks in handling complex scenarios such as scale variation and affine transformation. We address the data imbalance with an easy-sample penalization loss (shrinkage loss) in place of the smooth L1 function. Moreover, we perform multi-cue proposal re-ranking to ensure the accuracy of the proposed tracker. We extensively demonstrate the effectiveness of the proposed method through ablation studies on tracking datasets, including OTB-2015 (Object Tracking Benchmark 2015), VOT-2018 (Visual Object Tracking 2018), LaSOT (Large Scale Single Object Tracking), TrackingNet (A Large-Scale Dataset and Benchmark for Object Tracking in the Wild) and UAV123 (UAV Tracking Dataset).
AB - Region proposal network (RPN) based trackers employ classification and regression blocks to generate proposals; the proposal with the highest similarity score is taken as the ground-truth candidate for the next frame. However, RPN-based trackers cannot make full use of the features from different convolutional layers, and the original loss function cannot alleviate the data imbalance issue in the training procedure. We propose the Spatial Cascaded Transformed RPN, which combines the RPN with a spatial transformer network (STN) to obtain high-quality proposals while simultaneously improving robustness. The STN transfers spatially transformed features through different stages, which extends the spatial representation capability of such networks in handling complex scenarios such as scale variation and affine transformation. We address the data imbalance with an easy-sample penalization loss (shrinkage loss) in place of the smooth L1 function. Moreover, we perform multi-cue proposal re-ranking to ensure the accuracy of the proposed tracker. We extensively demonstrate the effectiveness of the proposed method through ablation studies on tracking datasets, including OTB-2015 (Object Tracking Benchmark 2015), VOT-2018 (Visual Object Tracking 2018), LaSOT (Large Scale Single Object Tracking), TrackingNet (A Large-Scale Dataset and Benchmark for Object Tracking in the Wild) and UAV123 (UAV Tracking Dataset).
UR - http://hdl.handle.net/10754/678983
UR - https://www.mdpi.com/1424-8220/20/17/4810
UR - http://www.scopus.com/inward/record.url?scp=85089845689&partnerID=8YFLogxK
U2 - 10.3390/s20174810
DO - 10.3390/s20174810
M3 - Article
C2 - 32858907
SN - 1424-8220
VL - 20
SP - 1
EP - 20
JO - Sensors (Switzerland)
JF - Sensors (Switzerland)
IS - 17
ER -