TY - JOUR
T1 - HiTE: a fast and accurate dynamic boundary adjustment approach for full-length transposable element detection and annotation
AU - Hu, Kang
AU - Ni, Peng
AU - Xu, Minghua
AU - Zou, You
AU - Chang, Jianye
AU - Gao, Xin
AU - Li, Yaohang
AU - Ruan, Jue
AU - Hu, Bin
AU - Wang, Jianxin
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, HiTE has identified numerous novel transposons with well-defined structures containing protein-coding domains, some of which are directly inserted within crucial genes, leading to direct alterations in gene expression. A Nextflow version of HiTE is also available, with enhanced parallelism, reproducibility, and portability.
UR - http://www.scopus.com/inward/record.url?scp=85197465401&partnerID=8YFLogxK
U2 - 10.1038/s41467-024-49912-8
DO - 10.1038/s41467-024-49912-8
M3 - Article
C2 - 38956036
AN - SCOPUS:85197465401
SN - 2041-1723
VL - 15
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 5573
ER -