TY - GEN
T1 - GCN-MF: Disease-gene association identification by graph convolutional networks and matrix factorization
AU - Han, Peng
AU - Shang, Shuo
AU - Yang, Peng
AU - Liu, Yong
AU - Zhao, Peilin
AU - Zhou, Jiayu
AU - Gao, Xin
AU - Kalnis, Panos
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2019/7/26
Y1 - 2019/7/26
N2 - Discovering disease-gene association is a fundamental and critical biomedical task, which assists biologists and physicians to discover pathogenic mechanism of syndromes. With various clinical biomarkers measuring the similarities among genes and disease phenotypes, network-based semi-supervised learning (NSSL) has been commonly utilized by these studies to address this class-imbalanced large-scale data issue. However, most existing NSSL approaches are based on linear models and suffer from two major limitations: 1) They implicitly consider a local-structure representation for each candidate; 2) They are unable to capture nonlinear associations between diseases and genes. In this paper, we propose a new framework for disease-gene association task by combining Graph Convolutional Network (GCN) and matrix factorization, named GCN-MF. With the help of GCN, we could capture nonlinear interactions and exploit measured similarities. Moreover, we define a margin control loss function to reduce the effect of sparsity. Empirical results demonstrate that the proposed deep learning algorithm outperforms all other state-of-the-art methods on most of metrics.
AB - Discovering disease-gene association is a fundamental and critical biomedical task, which assists biologists and physicians to discover pathogenic mechanism of syndromes. With various clinical biomarkers measuring the similarities among genes and disease phenotypes, network-based semi-supervised learning (NSSL) has been commonly utilized by these studies to address this class-imbalanced large-scale data issue. However, most existing NSSL approaches are based on linear models and suffer from two major limitations: 1) They implicitly consider a local-structure representation for each candidate; 2) They are unable to capture nonlinear associations between diseases and genes. In this paper, we propose a new framework for disease-gene association task by combining Graph Convolutional Network (GCN) and matrix factorization, named GCN-MF. With the help of GCN, we could capture nonlinear interactions and exploit measured similarities. Moreover, we define a margin control loss function to reduce the effect of sparsity. Empirical results demonstrate that the proposed deep learning algorithm outperforms all other state-of-the-art methods on most of metrics.
UR - http://hdl.handle.net/10754/656764
UR - http://dl.acm.org/citation.cfm?doid=3292500.3330912
UR - http://www.scopus.com/inward/record.url?scp=85071172087&partnerID=8YFLogxK
U2 - 10.1145/3292500.3330912
DO - 10.1145/3292500.3330912
M3 - Conference contribution
SN - 9781450362016
SP - 705
EP - 713
BT - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining - KDD '19
PB - Association for Computing [email protected]
ER -