TY - JOUR
T1 - AB-Gen: Antibody Library Design with Generative Pre-trained Transformer and Deep Reinforcement Learning.
AU - Xu, Xiaopeng
AU - Xu, Tiantian
AU - Zhou, Juexiao
AU - Liao, Xingyu
AU - Zhang, Ruochi
AU - Wang, Yu
AU - Zhang, Lu
AU - Gao, Xin
N1 - KAUST Repository Item: Exported on 2023-07-12
Acknowledged KAUST grant number(s): FCC/1/1976-44-01
Acknowledgements: This work was supported in part by the Office of Research Administration (ORA), King Abdullah University of Science and Technology (KAUST), Saudi Arabia, under Grant FCC/1/1976-44-01. We acknowledge the constructive comments from the reviewers, which helped improve the quality of this work.
PY - 2023/6/24
Y1 - 2023/6/24
N2 - Antibody leads must fulfill multiple desirable properties to be clinical candidates. Primarily due to the low throughput of experimental procedures, the need for such multi-property optimization creates a bottleneck in preclinical antibody discovery and development, because addressing one issue usually causes another. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained Transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. In addition, when using human epidermal growth factor receptor-2 (HER2) as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. A total of 509 generated sequences passed all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, which confirmed that the agent model was capable of capturing important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process.
AB - Antibody leads must fulfill multiple desirable properties to be clinical candidates. Primarily due to the low throughput of experimental procedures, the need for such multi-property optimization creates a bottleneck in preclinical antibody discovery and development, because addressing one issue usually causes another. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained Transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. In addition, when using human epidermal growth factor receptor-2 (HER2) as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. A total of 509 generated sequences passed all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, which confirmed that the agent model was capable of capturing important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process.
UR - http://hdl.handle.net/10754/690552
UR - https://linkinghub.elsevier.com/retrieve/pii/S167202292300092X
U2 - 10.1016/j.gpb.2023.03.004
DO - 10.1016/j.gpb.2023.03.004
M3 - Article
C2 - 37364719
SN - 1672-0229
JO - Genomics, Proteomics & Bioinformatics
JF - Genomics, Proteomics & Bioinformatics
ER -