TY - JOUR
T1 - Leveraging 2D molecular graph pretraining for improved 3D conformer generation with graph neural networks
AU - Alhamoud, Kumail
AU - Ghunaim, Yasir
AU - Alshehri, Abdulelah S.
AU - Li, Guohao
AU - Ghanem, Bernard
AU - You, Fengqi
N1 - Publisher Copyright:
© 2024 Elsevier Ltd
PY - 2024/4
Y1 - 2024/4
N2 - Predicting stable 3D molecular conformations from 2D molecular graphs is a challenging and resource-intensive task, yet it is critical for various applications, particularly drug design. Density functional theory (DFT) calculations set the standard for molecular conformation generation, yet they are computationally intensive. Deep learning offers more computationally efficient approaches, but struggles to match DFT accuracy, particularly on complex drug-like structures. Additionally, the steep computational demands of assembling 3D molecular datasets constrain the broader adoption of deep learning. This work aims to utilize the abundant 2D molecular graph datasets for pretraining a machine learning model, a step that involves initially training the model on a different task with a wealth of data before fine-tuning it for the target task of 3D conformation generation. We build on GeoMol, an end-to-end graph neural network (GNN) method for predicting atomic 3D structures and torsion angles. We examine the limitations of the GeoMol method and introduce new baselines to enhance molecular graph embeddings. Our computational results show that 2D molecular graph pretraining enhances the quality of generated 3D conformers, yielding a 7.7 % average improvement over state-of-the-art sequential methods. These advancements not only facilitate superior 3D conformation generation but also emphasize the potential of leveraging pretrained graph embeddings to boost performance in 3D chemical tasks with GNNs.
AB - Predicting stable 3D molecular conformations from 2D molecular graphs is a challenging and resource-intensive task, yet it is critical for various applications, particularly drug design. Density functional theory (DFT) calculations set the standard for molecular conformation generation, yet they are computationally intensive. Deep learning offers more computationally efficient approaches, but struggles to match DFT accuracy, particularly on complex drug-like structures. Additionally, the steep computational demands of assembling 3D molecular datasets constrain the broader adoption of deep learning. This work aims to utilize the abundant 2D molecular graph datasets for pretraining a machine learning model, a step that involves initially training the model on a different task with a wealth of data before fine-tuning it for the target task of 3D conformation generation. We build on GeoMol, an end-to-end graph neural network (GNN) method for predicting atomic 3D structures and torsion angles. We examine the limitations of the GeoMol method and introduce new baselines to enhance molecular graph embeddings. Our computational results show that 2D molecular graph pretraining enhances the quality of generated 3D conformers, yielding a 7.7 % average improvement over state-of-the-art sequential methods. These advancements not only facilitate superior 3D conformation generation but also emphasize the potential of leveraging pretrained graph embeddings to boost performance in 3D chemical tasks with GNNs.
KW - 3D molecular modeling
KW - Conformation generation
KW - Drug design
KW - Graph neural networks
KW - Pretraining molecular graph embeddings
UR - http://www.scopus.com/inward/record.url?scp=85184909370&partnerID=8YFLogxK
U2 - 10.1016/j.compchemeng.2024.108622
DO - 10.1016/j.compchemeng.2024.108622
M3 - Article
AN - SCOPUS:85184909370
SN - 0098-1354
VL - 183
JO - Computers and Chemical Engineering
JF - Computers and Chemical Engineering
M1 - 108622
ER -