TY - JOUR
T1 - Structural variants in 3000 rice genomes
AU - Fuentes, Roven Rommel
AU - Chebotarov, Dmytro
AU - Duitama, Jorge
AU - Smith, Sean
AU - De la Hoz, Juan Fernando
AU - Mohiyuddin, Marghoob
AU - Wing, Rod Anthony
AU - McNally, Kenneth L.
AU - Tatarinova, Tatiana
AU - Grigoriev, Andrey
AU - Mauleon, Ramil
AU - Alexandrov, Nickolai
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: Work in the Grigoriev laboratory is supported by awards from the National Science Foundation (DBI-1458202, ACI-1548562), National Institutes of Health (R15CA220059), and New Jersey Health Foundation (PC77-17). We thank XSEDE (MCB150014), PRAGMA, Universidad de los Andes, and the Department of Science and Technology, Advanced Science and Technology Institute of the Philippines for computing resources. We thank the Bill and Melinda Gates Foundation for support to the 3K RG via the GSR Phase 2 award OPPGD1393.
PY - 2019/4/16
Y1 - 2019/4/16
N2 - Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.
UR - http://hdl.handle.net/10754/652911
UR - https://genome.cshlp.org/content/29/5/870
UR - http://www.scopus.com/inward/record.url?scp=85065509354&partnerID=8YFLogxK
U2 - 10.1101/gr.241240.118
DO - 10.1101/gr.241240.118
M3 - Article
C2 - 30992303
SN - 1088-9051
VL - 29
SP - 870
EP - 880
JO - Genome Research
JF - Genome Research
IS - 5
ER -