TY - GEN
T1 - Distractor-Aware Video Object Segmentation
AU - Robinson, Andreas
AU - Eldesokey, Abdelrahman
AU - Felsberg, Michael
N1 - Funding Information:
Acknowledgements. This project was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation, the Excellence Center at Linköping-Lund in Information Technology (ELLIIT), the Swedish Research Council grant no. 2018-04673, and the Swedish Foundation for Strategic Research (SSF) project Symbicloud.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Semi-supervised video object segmentation is a challenging task that aims to segment a target throughout a video sequence given an initial mask at the first frame. Discriminative approaches have demonstrated competitive performance on this task at reasonable complexity. These approaches typically formulate the problem as a one-versus-one classification between the target and the background. However, in reality, a video sequence usually encompasses a target, background, and possibly other distracting objects. These objects increase the risk of introducing false positives, especially if they share visual similarities with the target. It is therefore more effective to separate distractors from the background and handle them independently. We propose a one-versus-many scheme that addresses this situation by separating distractors into their own class. This separation allows focusing attention on the challenging regions that are most likely to degrade performance. We demonstrate the effectiveness of this formulation by modifying the learning-what-to-learn [3] method to be distractor-aware. Our proposed approach sets a new state-of-the-art on the DAVIS 2017 validation dataset and improves over the baseline on the DAVIS 2017 test-dev benchmark by 4.6 percentage points.
AB - Semi-supervised video object segmentation is a challenging task that aims to segment a target throughout a video sequence given an initial mask at the first frame. Discriminative approaches have demonstrated competitive performance on this task at reasonable complexity. These approaches typically formulate the problem as a one-versus-one classification between the target and the background. However, in reality, a video sequence usually encompasses a target, background, and possibly other distracting objects. These objects increase the risk of introducing false positives, especially if they share visual similarities with the target. It is therefore more effective to separate distractors from the background and handle them independently. We propose a one-versus-many scheme that addresses this situation by separating distractors into their own class. This separation allows focusing attention on the challenging regions that are most likely to degrade performance. We demonstrate the effectiveness of this formulation by modifying the learning-what-to-learn [3] method to be distractor-aware. Our proposed approach sets a new state-of-the-art on the DAVIS 2017 validation dataset and improves over the baseline on the DAVIS 2017 test-dev benchmark by 4.6 percentage points.
UR - http://www.scopus.com/inward/record.url?scp=85124271728&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-92659-5_14
DO - 10.1007/978-3-030-92659-5_14
M3 - Conference contribution
AN - SCOPUS:85124271728
SN - 9783030926588
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 222
EP - 234
BT - Pattern Recognition - 43rd DAGM German Conference, DAGM GCPR 2021, Proceedings
A2 - Bauckhage, Christian
A2 - Gall, Juergen
A2 - Schwing, Alexander
PB - Springer Science and Business Media Deutschland GmbH
T2 - 43rd DAGM German Conference on Pattern Recognition, DAGM GCPR 2021
Y2 - 28 September 2021 through 1 October 2021
ER -