TY - JOUR
T1 - End-to-end interpretable disease–gene association prediction
AU - Li, Yang
AU - Guo, Zihou
AU - Wang, Keqi
AU - Gao, Xin
AU - Wang, Guohua
N1 - KAUST Repository Item: Exported on 2023-04-03
Acknowledgements: This work was supported by the National Natural Science Foundation of China (62276059, 62072095, 62225109) and the Heilongjiang Postdoctoral Science Foundation (LBH-Z20104). We thank the anonymous reviewers for their constructive comments.
PY - 2023/3/28
Y1 - 2023/3/28
N2 - Identifying disease–gene associations is a fundamental and critical biomedical task towards understanding molecular mechanisms, the diagnosis and treatment of diseases. It is time-consuming and expensive to experimentally verify causal links between diseases and genes. Recently, deep learning methods have achieved tremendous success in identifying candidate genes for genetic diseases. The gene prediction problem can be modeled as a link prediction problem based on the features of nodes and edges of the gene–disease graph. However, most existing researches either build homogeneous networks based on one single data source or heterogeneous networks based on multi-source data, and artificially define meta-paths, so as to learn the network representation of diseases and genes. The former cannot make use of abundant multi-source heterogeneous information, while the latter needs domain knowledge and experience when defining meta-paths, and the accuracy of the model largely depends on the definition of meta-paths. To address the aforementioned challenges above bottlenecks, we propose an end-to-end disease–gene association prediction model with parallel graph transformer network (DGP-PGTN), which deeply integrates the heterogeneous information of diseases, genes, ontologies and phenotypes. DGP-PGTN can automatically and comprehensively capture the multiple latent interactions between diseases and genes, discover the causal relationship between them and is fully interpretable at the same time. We conduct comprehensive experiments and show that DGP-PGTN outperforms the state-of-the-art methods significantly on the task of disease–gene association prediction. Furthermore, DGP-PGTN can automatically learn the implicit relationship between diseases and genes without manually defining meta paths.
AB - Identifying disease–gene associations is a fundamental and critical biomedical task towards understanding molecular mechanisms, the diagnosis and treatment of diseases. It is time-consuming and expensive to experimentally verify causal links between diseases and genes. Recently, deep learning methods have achieved tremendous success in identifying candidate genes for genetic diseases. The gene prediction problem can be modeled as a link prediction problem based on the features of nodes and edges of the gene–disease graph. However, most existing researches either build homogeneous networks based on one single data source or heterogeneous networks based on multi-source data, and artificially define meta-paths, so as to learn the network representation of diseases and genes. The former cannot make use of abundant multi-source heterogeneous information, while the latter needs domain knowledge and experience when defining meta-paths, and the accuracy of the model largely depends on the definition of meta-paths. To address the aforementioned challenges above bottlenecks, we propose an end-to-end disease–gene association prediction model with parallel graph transformer network (DGP-PGTN), which deeply integrates the heterogeneous information of diseases, genes, ontologies and phenotypes. DGP-PGTN can automatically and comprehensively capture the multiple latent interactions between diseases and genes, discover the causal relationship between them and is fully interpretable at the same time. We conduct comprehensive experiments and show that DGP-PGTN outperforms the state-of-the-art methods significantly on the task of disease–gene association prediction. Furthermore, DGP-PGTN can automatically learn the implicit relationship between diseases and genes without manually defining meta paths.
UR - http://hdl.handle.net/10754/690791
UR - https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbad118/7091476
U2 - 10.1093/bib/bbad118
DO - 10.1093/bib/bbad118
M3 - Article
C2 - 36987781
SN - 1467-5463
JO - Briefings in bioinformatics
JF - Briefings in bioinformatics
ER -