TY - JOUR
T1 - MetastaSite
T2 - Predicting metastasis to different sites using deep learning with gene expression data
AU - Albaradei, Somayah
AU - Albaradei, Abdurhman
AU - Alsaedi, Asim
AU - Uludag, Mahmut
AU - Thafar, Maha A.
AU - Gojobori, Takashi
AU - Essack, Magbubah
AU - Gao, Xin
N1 - Funding Information:
The research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST) through grant awards Nos. BAS/1/1059-01-01, BAS/1/1624-01-01, FCC/1/1976-20-01, FCC/1/1976-26-01, URF/1/3450-01-01, REI/1/4216-01-01, REI/1/4437-01-01, REI/1/4473-01-01, and URF/1/4098-01-01.
Publisher Copyright:
Copyright © 2022 Albaradei, Albaradei, Alsaedi, Uludag, Thafar, Gojobori, Essack and Gao.
PY - 2022/7/22
Y1 - 2022/7/22
AB - Deep learning has great potential for predicting phenotype from different omics profiles. However, deep neural networks are viewed as black boxes that provide predictions without explanation. The demand for interpretable models is therefore increasing, especially in the medical field. Here we propose a deep learning-based computational framework that takes the gene expression profile of any primary cancer sample and predicts whether the sample is primary (localized) or metastasized to the brain, bone, lung, or liver. Specifically, we first constructed an AutoEncoder framework to learn the non-linear relationships between genes, and then applied DeepLIFT to calculate the genes’ importance scores. Next, to mine the top essential genes that can distinguish primary from metastasized tumors, we iteratively added ten top-ranked genes, based on their importance scores, to train a DNN model. We then trained a final multi-class DNN that takes the output of the previous step as input and predicts whether samples are primary or metastasized to the brain, bone, lung, or liver. The AUC values ranged from 0.82 to 0.93. We further designed the model’s workflow to provide a second functionality beyond metastasis site prediction, namely identifying the biological functions that the DL model uses to perform the prediction. To our knowledge, this is the first multi-class DNN model developed for the generic prediction of metastasis to various sites.
KW - artificial intelligence
KW - clinical decision-making
KW - deep learning
KW - gene expression
KW - machine learning
KW - metastasis
KW - metastasis site
UR - http://www.scopus.com/inward/record.url?scp=85135449419&partnerID=8YFLogxK
U2 - 10.3389/fmolb.2022.913602
DO - 10.3389/fmolb.2022.913602
M3 - Article
C2 - 35936793
AN - SCOPUS:85135449419
SN - 2296-889X
VL - 9
JO - Frontiers in Molecular Biosciences
JF - Frontiers in Molecular Biosciences
M1 - 913602
ER -