TY - JOUR
T1 - AllelePred: A Simple Allele Frequencies Ensemble Predictor for Different Single Nucleotide Variants
AU - Sobahy, Turki
AU - Motwalli, Olaa
AU - Alazmi, Meshari
N1 - KAUST Repository Item: Exported on 2022-04-26
Acknowledgements: For computational resources, this research used the resources of the Supercomputing Laboratory at King Abdullah University of Science & Technology (KAUST) in Thuwal, Saudi Arabia. Allele frequencies for the Saudi population were made available by the Saudi Human Genome Program (SHGP) in King Abdulaziz City for Science & Technology (KACST). KFSHRC-R (Thanks to the provider Dr. Fowzan S. Alkuraya) shared the two clinical exome samples used in this research. KFSHRC-J (Thanks to the provider Dr. Yousef Hawsawi) shared the two research exome samples used in this study, and supported the research.
PY - 2022/3/3
Y1 - 2022/3/3
N2 - Background & Objective: Genomic medicine stands to be revolutionized by understanding single nucleotide variants (SNVs) and their expression in single-gene disorders (Mendelian diseases). Computational tools can play a vital role in the exploration of such variations and their pathogenicity. Consequently, we developed the ensemble prediction tool AllelePred to identify deleterious SNVs and disease causative genes. Results: The model utilizes different population genetics backgrounds and restricted criteria for feature selection to help generate high accuracy results. In comparison to other tools, such as Eigen, PROVEAN, and fathmm-MKL, our classifier achieves higher accuracy (98%), precision (96%), F1 score (93%), and coverage (100%) for different types of coding variants. The new method was also compared against a bioinformatics analytical workflow, which uses gnomAD overall AFs (less than 1%) and CADD (scaled C-score of at least 15). Furthermore, this research highlights the stature of genetic variant sharing and curation. We accumulated a list of highly probable deleterious variants and recommended further experimental validation before medical diagnostic usage. Conclusions: The ensemble prediction tool AllelePred enables increased accuracy in recognizing deleterious SNVs and the genetic determinants in real clinical data.
AB - Background & Objective: Genomic medicine stands to be revolutionized by understanding single nucleotide variants (SNVs) and their expression in single-gene disorders (Mendelian diseases). Computational tools can play a vital role in the exploration of such variations and their pathogenicity. Consequently, we developed the ensemble prediction tool AllelePred to identify deleterious SNVs and disease causative genes. Results: The model utilizes different population genetics backgrounds and restricted criteria for feature selection to help generate high accuracy results. In comparison to other tools, such as Eigen, PROVEAN, and fathmm-MKL, our classifier achieves higher accuracy (98%), precision (96%), F1 score (93%), and coverage (100%) for different types of coding variants. The new method was also compared against a bioinformatics analytical workflow, which uses gnomAD overall AFs (less than 1%) and CADD (scaled C-score of at least 15). Furthermore, this research highlights the stature of genetic variant sharing and curation. We accumulated a list of highly probable deleterious variants and recommended further experimental validation before medical diagnostic usage. Conclusions: The ensemble prediction tool AllelePred enables increased accuracy in recognizing deleterious SNVs and the genetic determinants in real clinical data.
UR - http://hdl.handle.net/10754/676458
UR - https://ieeexplore.ieee.org/document/9726877/
UR - http://www.scopus.com/inward/record.url?scp=85125701533&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2022.3155659
DO - 10.1109/TCBB.2022.3155659
M3 - Article
C2 - 35239491
SN - 1557-9964
SP - 1
EP - 1
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
ER -