TY - CONF
T1 - Partial multi-label learning using label compression
AU - Yu, Tingting
AU - Yu, Guoxian
AU - Wang, Jun
AU - Domeniconi, Carlotta
AU - Zhang, Xiangliang
N1 - KAUST Repository Item: Exported on 2021-03-01
PY - 2020/11
Y1 - 2020/11
AB - Partial multi-label learning (PML) aims to learn a robust multi-label classifier from partial multi-label data, where each sample is annotated with a set of candidate labels of which only a subset is valid. Existing PML algorithms generally suffer from high computational cost when learning with large label spaces. In this paper, we introduce a PML approach (PML-LCom) that uses Label Compression to efficiently learn from partial multi-label data. PML-LCom first splits the observed label matrix into a latent relevant label matrix and an irrelevant one, and then factorizes the relevant label matrix into two low-rank matrices: one encodes the compressed labels of samples, and the other captures the underlying label correlations. Next, it optimizes the coefficient matrix of the multi-label predictor with respect to the compressed label matrix. In addition, it regularizes the compressed label matrix with respect to the feature similarity of samples, and optimizes the label matrix and predictor in a coherent manner. Experimental results on both semi-synthetic and real-world PML datasets show that PML-LCom outperforms state-of-the-art solutions at predicting the labels of unlabeled samples with a large label space. Label compression improves both effectiveness and efficiency, and the coherent optimization mutually benefits the label matrix and the predictor.
UR - http://hdl.handle.net/10754/667718
UR - https://ieeexplore.ieee.org/document/9338400/
UR - http://www.scopus.com/inward/record.url?scp=85100875019&partnerID=8YFLogxK
DO - 10.1109/ICDM50108.2020.00085
M3 - Conference contribution
SN - 9781728183169
SP - 761
EP - 770
BT - 2020 IEEE International Conference on Data Mining (ICDM)
PB - IEEE
ER -