Abstract
Background
Polyadenylation is a critical stage of RNA processing during the formation of mature mRNA, and is present in most of the known eukaryote protein-coding transcripts and many long non-coding RNAs. The correct identification of poly(A) signals (PAS) not only helps to elucidate the 3′-end genomic boundaries of a transcribed DNA region and gene regulatory mechanisms but also gives insight into the multiple transcript isoforms resulting from alternative PAS. Although progress has been made in the in-silico prediction of genomic signals, the recognition of PAS in DNA genomic sequences remains a challenge.
Results
In this study, we analyzed human genomic DNA sequences for the 12 most common PAS variants. Our analysis has identified a set of features that helps in the recognition of true PAS, which may be involved in the regulation of the polyadenylation process. The proposed features, in combination with a recognition model, resulted in a novel method and tool, Omni-PolyA. Omni-PolyA combines several machine learning techniques such as different classifiers in a tree-like decision structure and genetic algorithms for deriving a robust classification model. We performed a comparison between results obtained by state-of-the-art methods, deep neural networks, and Omni-PolyA. Results show that Omni-PolyA significantly reduced the average classification error rate by 35.37% in the prediction of the 12 considered PAS variants relative to the state-of-the-art results.
Conclusions
The results of our study demonstrate that Omni-PolyA is currently the most accurate model for the prediction of PAS in human and can serve as a useful complement to other PAS recognition methods. Omni-PolyA is publicly available as an online tool accessible at www.cbrc.kaust.edu.sa/omnipolya/.
Original language | English (US) |
---|---|
Journal | BMC Genomics |
Volume | 18 |
Issue number | 1 |
State | Published - Aug 15 2017 |
Datasets
- Omni-PolyA: a method and tool for accurate recognition of Poly(A) signals in human genomic DNA (Dataset)
  Magana-Mora, A. (Creator), Kalkatawi, M. M. (Creator) & Bajic, V. B. (Creator), figshare, 2017
  DOI: 10.6084/m9.figshare.c.3854206.v1, http://hdl.handle.net/10754/663799