TY - JOUR
T1 - Beyond Weakly-supervised: Pseudo Ground Truths Mining for Missing Bounding-boxes Object Detection
AU - Zhang, Yongqiang
AU - Ding, Mingli
AU - Bai, Yancheng
AU - Xu, Mengmeng
AU - Ghanem, Bernard
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2019
Y1 - 2019
N2 - Due to the shortcomings of weakly-supervised and fully-supervised object detection (i.e., unsatisfactory performance and expensive annotations, respectively), leveraging partially labeled images in a cost-effective way to train an object detector has attracted much attention. In this paper, we formulate this challenging task as a missing bounding-boxes object detection problem. Specifically, we develop a pseudo ground truth mining (PGTM) procedure to automatically find the missing bounding-boxes for the unlabeled instances in the training data, called pseudo ground truths here, and then combine the mined pseudo ground truths with the labeled annotations to train a fully-supervised object detector. Furthermore, we propose an incremental learning (IL) framework to gradually incorporate the results of the trained fully-supervised detector to improve the performance of missing bounding-boxes object detection. More importantly, we find an effective way to label massive numbers of images with limited labor and funds, which is crucial when building a large-scale weakly/webly labeled dataset for object detection. Extensive experiments on the PASCAL VOC and COCO benchmarks demonstrate that our proposed method can narrow the gap between fully-supervised and weakly-supervised object detectors, and we outperform the previous state-of-the-art weakly-supervised detectors by a large margin (more than 3% absolute mAP) when the missing rate equals 0.9. Moreover, our proposed method with 30% missing bounding-box annotations can achieve performance comparable to some fully-supervised detectors.
AB - Due to the shortcomings of weakly-supervised and fully-supervised object detection (i.e., unsatisfactory performance and expensive annotations, respectively), leveraging partially labeled images in a cost-effective way to train an object detector has attracted much attention. In this paper, we formulate this challenging task as a missing bounding-boxes object detection problem. Specifically, we develop a pseudo ground truth mining (PGTM) procedure to automatically find the missing bounding-boxes for the unlabeled instances in the training data, called pseudo ground truths here, and then combine the mined pseudo ground truths with the labeled annotations to train a fully-supervised object detector. Furthermore, we propose an incremental learning (IL) framework to gradually incorporate the results of the trained fully-supervised detector to improve the performance of missing bounding-boxes object detection. More importantly, we find an effective way to label massive numbers of images with limited labor and funds, which is crucial when building a large-scale weakly/webly labeled dataset for object detection. Extensive experiments on the PASCAL VOC and COCO benchmarks demonstrate that our proposed method can narrow the gap between fully-supervised and weakly-supervised object detectors, and we outperform the previous state-of-the-art weakly-supervised detectors by a large margin (more than 3% absolute mAP) when the missing rate equals 0.9. Moreover, our proposed method with 30% missing bounding-box annotations can achieve performance comparable to some fully-supervised detectors.
UR - http://hdl.handle.net/10754/655921
UR - https://ieeexplore.ieee.org/document/8638807/
UR - http://www.scopus.com/inward/record.url?scp=85083035116&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2019.2898559
DO - 10.1109/TCSVT.2019.2898559
M3 - Article
SN - 1051-8215
VL - 30
SP - 1
EP - 1
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 4
ER -