TY - GEN
T1 - Transmembrane topology identification by fusing evolutionary and co-evolutionary information with cascaded bidirectional transformers
AU - Li, Zhen
AU - Ni, Chongming
AU - Xu, Jinbo
AU - Gao, Xin
AU - Cui, Shuguang
AU - Wang, Sheng
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work is supported by Shenzhen Fundamental Research Fund under grants No. KQTD2015033114415450 and No. ZDSYS201707251409055, and by grant No. 2017ZT07X152. This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Awards No. FCC/1/1976 17 01, FCC/1/1976 18 01, FCC/1/1976 23 01, FCC/1/1976 25 01, FCC/1/1976 26 01, and URF/1/3450 01 to X.G. This work was also supported by National Institutes of Health (NIH) [R01GM089753] and National Science Foundation (NSF) [DBI 1564955] to J.X.
PY - 2019/9/9
Y1 - 2019/9/9
AB - Transmembrane topology is key to understanding the 3D structures of multi-pass transmembrane proteins (mTPs). However, accurately predicting the 1D topology label for each residue of an mTP from evolutionary information alone is very challenging, if not infeasible. In this work, we propose a novel approach that identifies the transmembrane topology under an object detection framework: it takes as input the 2D distance matrix predicted from co-evolutionary information and then applies several bidirectional Transformer blocks that fuse the 2D and 1D features for accurate label prediction. Specifically, we employ a Faster-RCNN module to simultaneously predict the rectangular bounds that cover the interacting transmembrane regions and the confidence scores that discriminate them from non-transmembrane regions. To integrate the 2D pairwise features with the 1D sequential features, we build several bidirectional Transformer blocks consisting of self-attention units that capture long-range dependencies in the transmembrane topology. Tested on 330 non-redundant mTPs and 45 newly released mTPs, our approach achieves Segment OVerlap (SOV) scores of 0.927 and 0.843, respectively, which are about 4.5% and 6.6% better than cutting-edge consensus methods.
UR - http://hdl.handle.net/10754/660599
UR - http://dl.acm.org/citation.cfm?doid=3307339.3342140
UR - http://www.scopus.com/inward/record.url?scp=85073154516&partnerID=8YFLogxK
U2 - 10.1145/3307339.3342140
DO - 10.1145/3307339.3342140
M3 - Conference contribution
SN - 9781450366663
SP - 136
EP - 143
BT - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics - BCB '19
PB - ACM Press
ER -