TY - JOUR
T1 - Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning
AU - Teng, Haotian
AU - Cao, Minh Duc
AU - Hall, Michael B
AU - Duarte, Tania
AU - Wang, Sheng
AU - Coin, Lachlan J M
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: We thank Jianhua Guo for contributing the DNA for the E. coli sample. We thank Arnold Bainomugisa for extracting DNA for the M. tuberculosis sample. We thank Sheng Wang and Han Qiao for the helpful discussion. We thank Jain et al. [14] for the open Human nanopore dataset.
PY - 2018/4/10
Y1 - 2018/4/10
N2 - Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.
UR - http://hdl.handle.net/10754/627895
UR - https://academic.oup.com/gigascience/article/7/5/giy037/4966989#116612803
UR - http://www.scopus.com/inward/record.url?scp=85056709264&partnerID=8YFLogxK
U2 - 10.1093/gigascience/giy037
DO - 10.1093/gigascience/giy037
M3 - Article
C2 - 29648610
SN - 2047-217X
VL - 7
JO - GigaScience
JF - GigaScience
IS - 5
ER -