TY - JOUR
T1 - T-PAIR: Temporal Node-pair Embedding for Automatic Biomedical Hypothesis Generation
AU - Akujuobi, Uchenna Thankgod
AU - Spranger, Michael
AU - Palaniappan, Sucheendra K
AU - Zhang, Xiangliang
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work was supported and funded by Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), and National Science Foundation of China (NSFC No. 61828302). We would also like to thank Ayako Yachie from The Systems Biology Institute for her help in validating and checking the list of hypotheses generated by our algorithm.
PY - 2020
Y1 - 2020
AB - In this paper, we study an automatic hypothesis generation (HG) problem, which refers to the discovery of meaningful implicit connections between scientific terms, including but not limited to diseases, chemicals, drugs, and genes extracted from databases of biomedical publications. Most prior studies of this problem focused on the use of static information of terms and largely ignored the temporal dynamics of scientific term relations. Even when the dynamics were considered in a few recent studies, they learned the representations for the scientific terms, rather than focusing on the term-pair relations. Since the HG problem is to predict term-pair connections, it is not enough to know with whom the terms are connected; it is more important to know how the connections have been formed (in a dynamic process). We formulate this HG problem as a future connectivity prediction in a dynamic attributed graph. The key is to capture the temporal evolution of node-pair (term-pair) relations. We propose an inductive edge (node-pair) embedding method named T-PAIR, utilizing both the graphical structure and node attribute to encode the temporal node-pair relationship. We demonstrate the efficiency of the proposed model on three real-world datasets constructed from biomedical publications in the transductive and inductive settings.
UR - http://hdl.handle.net/10754/664664
UR - https://ieeexplore.ieee.org/document/9170911/
U2 - 10.1109/TKDE.2020.3017687
DO - 10.1109/TKDE.2020.3017687
M3 - Article
SN - 2326-3865
SP - 1
EP - 1
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
ER -