TY - JOUR
T1 - Noise-robust Deep Cross-Modal Hashing
AU - Wang, Runmin
AU - Yu, Guoxian
AU - Zhang, Hong
AU - Guo, Maozu
AU - Cui, Lizhen
AU - Zhang, Xiangliang
N1 - KAUST Repository Item: Exported on 2021-11-20
Acknowledgements: This work was supported by the Natural Science Foundation of China (61872300 and 62031003).
PY - 2021/9/14
Y1 - 2021/9/14
N2 - Cross-modal hashing has been intensively studied to efficiently retrieve multi-modal data across modalities. Supervised cross-modal hashing methods leverage the labels of training data to improve the retrieval performance. However, most of these methods still assume that the semantic labels of training data are ideally complete and noise-free. This assumption is too optimistic for real multi-modal data, whose label annotations are, in essence, error-prone. To achieve effective cross-modal hashing on multi-modal data with noisy labels, we introduce an end-to-end solution called Noise-robust Deep Cross-modal Hashing (NrDCMH). NrDCMH contains two main components: a noisy instance detection module and a hash code learning module. In the noise detection module, NrDCMH first detects noisy training instance pairs based on the margin between the label similarity and the feature similarity, and assigns weights to pairs according to this margin. In the hash learning module, NrDCMH incorporates these weights into a likelihood loss function to reduce the impact of instances with noisy labels, and learns compatible deep features by applying different neural networks to multi-modal data in a unified end-to-end framework. Experimental results on multi-modal benchmark datasets demonstrate that NrDCMH performs significantly better than competitive methods under noisy label annotations. NrDCMH also achieves competitive results in ‘noise-free’ scenarios.
UR - http://hdl.handle.net/10754/672037
UR - https://linkinghub.elsevier.com/retrieve/pii/S0020025521009610
UR - http://www.scopus.com/inward/record.url?scp=85115427402&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2021.09.030
DO - 10.1016/j.ins.2021.09.030
M3 - Article
SN - 0020-0255
VL - 581
SP - 136
EP - 154
JO - Information Sciences
JF - Information Sciences
ER -