TY - JOUR
T1 - Accelerating bioactive peptide discovery via mutual information-based meta-learning.
AU - He, Wenjia
AU - Jiang, Yi
AU - Jin, Junru
AU - Li, Zhongshen
AU - Zhao, Jiaojiao
AU - Manavalan, Balachandran
AU - Su, Ran
AU - Gao, Xin
AU - Wei, Leyi
N1 - KAUST Repository Item: Exported on 2022-01-13
Acknowledged KAUST grant number(s): FCC/1/1976-04-01, REI/1/0018-01-01, REI/1/4473-01-01, REI/1/4742-01-01, URF/1/4098-01-01
Acknowledgements: Natural Science Foundation of China (62072329 and 62071278) and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award Nos. FCC/1/1976-04-01, URF/1/4098-01-01, REI/1/0018-01-01, REI/1/4473-01-01 and REI/1/4742-01-01.
PY - 2021/12/9
Y1 - 2021/12/9
N2 - Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, these methods often cannot be trained sufficiently, which easily results in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using only a few samples from various functional peptides, MIMML can sufficiently learn the discriminative information among various functions and characterize functional differences. Experimental results show excellent performance of MIMML despite using far fewer training samples than the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available at https://github.com/TearsWaiting/MIMML.
AB - Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, these methods often cannot be trained sufficiently, which easily results in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using only a few samples from various functional peptides, MIMML can sufficiently learn the discriminative information among various functions and characterize functional differences. Experimental results show excellent performance of MIMML despite using far fewer training samples than the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available at https://github.com/TearsWaiting/MIMML.
UR - http://hdl.handle.net/10754/674908
UR - https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbab499/6457168
U2 - 10.1093/bib/bbab499
DO - 10.1093/bib/bbab499
M3 - Article
C2 - 34882225
SN - 1467-5463
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
ER -