TY - JOUR
T1 - STATegra: Multi-Omics Data Integration - A Conceptual Scheme With a Bioinformatics Pipeline
AU - Planell, Nuria
AU - Lagani, Vincenzo
AU - Sebastian-Leon, Patricia
AU - van der Kloet, Frans
AU - Ewing, Ewoud
AU - Karathanasis, Nestoras
AU - Urdangarin, Arantxa
AU - Arozarena, Imanol
AU - Jagodic, Maja
AU - Tsamardinos, Ioannis
AU - Tarazona, Sonia
AU - Conesa, Ana
AU - Tegner, Jesper
AU - Gomez-Cabrero, David
N1 - KAUST Repository Item: Exported on 2021-03-25
Acknowledgements: We thank all members of the STATegra consortium for their contributions to this work.
PY - 2021/3/22
Y1 - 2021/3/22
N2 - Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decision over which tools to use and how to combine them. Therefore, there is an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, combining available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two The Cancer Genome Atlas (TCGA) case studies. For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
AB - Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decision over which tools to use and how to combine them. Therefore, there is an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, combining available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two The Cancer Genome Atlas (TCGA) case studies. For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
UR - http://hdl.handle.net/10754/666319
UR - https://www.frontiersin.org/articles/10.3389/fgene.2021.620453/full
U2 - 10.3389/fgene.2021.620453
DO - 10.3389/fgene.2021.620453
M3 - Article
C2 - 33747045
SN - 1664-8021
VL - 12
JO - Frontiers in Genetics
JF - Frontiers in Genetics
ER -