TY - JOUR
T1 - Comparative homology agreement search
T2 - An effective combination of homology-search methods
AU - Alam, Intikhab
AU - Dress, Andreas
AU - Rehmsmeier, Marc
AU - Fuellen, Georg
PY - 2004/09/21
Y1 - 2004/09/21
N2 - Many methods have been developed to search for homologous members of a protein family in databases, and the reliability of results and conclusions may be compromised if only one method is used, neglecting the others. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (CHASE) that integrates different search strategies to obtain a combined "E value." Our results show that a consensus method integrating distinct strategies easily outperforms any of its component algorithms. More specifically, an evaluation based on the Structural Classification of Proteins database reveals that, on average, a coverage of 47% can be obtained in searches for distantly related homologues (i.e., members of the same superfamily but not the same family, which is a very difficult task), accepting only 10 false positives, whereas the individual methods obtain a coverage of 28-38%.
UR - http://www.scopus.com/inward/record.url?scp=4644269503&partnerID=8YFLogxK
U2 - 10.1073/pnas.0405612101
DO - 10.1073/pnas.0405612101
M3 - Article
C2 - 15367730
AN - SCOPUS:4644269503
SN - 0027-8424
VL - 101
SP - 13814
EP - 13819
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 38
ER -