Automated machine learning for Genome Wide Association Studies.

Kleanthi Lakiotaki, Zaharias Papadovasilakis, Vincenzo Lagani, Stefanos Fafalios, Paulos Charonyktakis, Michail Tsagris, Ioannis Tsamardinos

Research output: Contribution to journal, Article, peer-review

4 Scopus citations

Abstract

Motivation: Genome Wide Association Studies (GWAS) present several computational and statistical challenges for their data analysis, including knowledge discovery, interpretability, and translation to clinical practice.

Results: We develop, apply, and comparatively evaluate an Automated Machine Learning (AutoML) approach, customized for genomic data that delivers reliable predictive and diagnostic models, the set of genetic variants that are important for predictions (called a biosignature), and an estimate of the out-of-sample predictive power. This AutoML approach discovers variants with higher predictive performance compared to standard GWAS methods, computes an individual risk prediction score, generalizes to new, unseen data, is shown to better differentiate causal variants from other highly correlated variants, and enhances knowledge discovery and interpretability by reporting multiple equivalent biosignatures.

Original language: English (US)
Journal: Bioinformatics (Oxford, England)
State: Published - Sep 6 2023

ASJC Scopus subject areas

  • Biochemistry
  • Computational Theory and Mathematics
  • Computational Mathematics
  • Molecular Biology
  • Statistics and Probability
  • Computer Science Applications
