TY - GEN
T1 - Pairwise structure alignment specifically tuned for surface pockets and interaction interfaces
AU - Cui, Xuefeng
AU - Naveed, Hammad
AU - Gao, Xin
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2015/9/29
Y1 - 2015/9/29
N2 - To detect and evaluate the similarities between the three-dimensional (3D) structures of two molecules, various methods have been proposed for the pairwise structure alignment problem [6, 9, 7, 11]. The problem plays an important role in studying the function and evolution of biological molecules. Recently, pairwise structure alignment methods have been extended and applied to surface pocket structures [10, 3, 5] and interaction interface structures [8, 4]. The results show that, even when no similarities are found between the global sequences or the global structures, biological molecules or complexes can share similar functions because of well-conserved pockets and interfaces. Thus, pairwise pocket and interface structure alignments are promising for unveiling shared functions that cannot be discovered by the well-studied global sequence and global structure alignments.
State-of-the-art methods for pairwise pocket and interface structure alignments [4, 5] are direct extensions of the classic pairwise protein structure alignment methods, and thus share a few of their limitations. First, the goal of the classic protein structure alignment methods is to align single-chain protein structures (i.e., a single fragment of residues connected by peptide bonds). However, we observed that pockets and interfaces tend to consist of tens of extremely short backbone fragments (i.e., three or fewer residues connected by peptide bonds). Because existing pocket and interface alignment methods are based on protein structure alignment methods, they still rely on the existence of long-enough backbone fragments, and the fragmentation of pockets and interfaces raises the risk of missing the optimal alignments. Moreover, existing interface structure alignment methods focus on protein-protein interfaces, and require "blackbox preprocessing" before aligning protein-DNA and protein-RNA interfaces. Therefore, we introduce the PROtein STructure Alignment (PROSTA) family of pairwise structure alignment methods [1, 2] that address the fragmentation issue of pockets and interfaces, and automatically align interfaces of any type of biological complex.
Our PROSTA structure alignment methods have two critical advantages over existing structure alignment methods. First, our methods are completely sequence-order independent, which is critical to the success of pairwise pocket and interface structure alignments. This is achieved by introducing contact groups that are not limited to backbone fragments, and by employing a maximum weighted bipartite matching solver from the beginning of the alignment process. Second, our methods incorporate similarities of sequentially and structurally remote residues that potentially model the topology of the global structure. Compared with existing methods that focus on local structure or whole-sequence similarities, topological similarities are more reliable for finding near-optimal structure alignments in the initial alignment state. As a result, a significant number of similar pockets and interfaces are newly discovered, and the literature supports that similar functions are shared between the biological complexes in our case studies.
The PROSTA web server and source code are publicly available at "http://www.cbrc.kaust.edu.sa/prosta/".
UR - http://hdl.handle.net/10754/579616
UR - http://dl.acm.org/citation.cfm?doid=2808719.2811431
U2 - 10.1145/2808719.2811431
DO - 10.1145/2808719.2811431
M3 - Conference contribution
SN - 9781450338530
BT - Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics - BCB '15
PB - Association for Computing Machinery (ACM)
ER -