TY - GEN
T1 - Multi-label learning with highly incomplete data via collaborative embedding
AU - Han, Yufei
AU - Shen, Yun
AU - Sun, Guolei
AU - Zhang, Xiangliang
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/7/19
Y1 - 2018/7/19
AB - Tremendous efforts have been dedicated to improving the effectiveness of multi-label learning with incomplete label assignments. Most of the current techniques assume that the input features of data instances are complete. Nevertheless, the co-occurrence of highly incomplete features and weak label assignments is a challenging and widely perceived issue in real-world multi-label learning applications due to a number of practical reasons including incomplete data collection, moderate labels from annotators, etc. Existing multi-label learning algorithms are not directly applicable when the observed features are highly incomplete. In this work, we attack this problem by proposing a weakly supervised multi-label learning approach, based on the idea of collaborative embedding. This approach provides a flexible framework to conduct efficient multi-label classification at both transductive and inductive mode by coupling the process of reconstructing missing features and weak label assignments in a joint optimisation framework. It is designed to collaboratively recover feature and label information, and extract the predictive association between the feature profile and the multi-label tag of the same data instance. Substantial experiments on public benchmark datasets and real security event data validate that our proposed method can provide distinctively more accurate transductive and inductive classification than other state-of-the-art algorithms.
KW - Highly incomplete feature
KW - Multi-label learning
KW - Weak labels
UR - http://www.scopus.com/inward/record.url?scp=85051540548&partnerID=8YFLogxK
U2 - 10.1145/3219819.3220038
DO - 10.1145/3219819.3220038
M3 - Conference contribution
AN - SCOPUS:85051540548
SN - 9781450355520
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1494
EP - 1503
BT - KDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
Y2 - 19 August 2018 through 23 August 2018
ER -