TY - JOUR
T1 - Machine Learning to Predict Standard Enthalpy of Formation of Hydrocarbons
AU - Yalamanchi, Kiran K.
AU - Van Oudenhoven, Vincent C.O.
AU - Tutino, Francesco
AU - Monge Palacios, Manuel
AU - Alshehri, Abdulelah
AU - Gao, Xin
AU - Sarathy, Mani
N1 - KAUST Repository Item: Exported on 2020-10-01. Acknowledgements: The work at King Abdullah University of Science and Technology (KAUST) was supported by the KAUST Clean Fuels Consortium (KCFC) and its member companies.
PY - 2019/8/29
Y1 - 2019/8/29
N2 - Thermodynamic properties of molecules are used widely in the study of reactive processes. Such properties are typically measured via experiments or calculated by a variety of computational chemistry methods. In this work, machine learning (ML) models for estimation of standard enthalpy of formation at 298.15 K are developed for three classes of acyclic and closed-shell hydrocarbons, viz. alkanes, alkenes, and alkynes. Initially, an extensive literature survey is performed to collect standard enthalpy data for training ML models. A commercial software (Dragon) is used to obtain a wide set of molecular descriptors by providing SMILES strings. The molecular descriptors are used as input features for the ML models. Support vector regression (SVR) and artificial neural networks are used with a two-level K-fold cross-validation (K-fold CV) workflow. The first level is for estimation of accuracy of both the ML models, and the second level is for generation of the final models. The SVR model is selected as the best model based on error estimates over 10-fold CV. The final SVR model is compared against conventional Benson's group additivity for a set of octene isomers from the database, illustrating the advantages of the proposed ML modeling approach.
AB - Thermodynamic properties of molecules are used widely in the study of reactive processes. Such properties are typically measured via experiments or calculated by a variety of computational chemistry methods. In this work, machine learning (ML) models for estimation of standard enthalpy of formation at 298.15 K are developed for three classes of acyclic and closed-shell hydrocarbons, viz. alkanes, alkenes, and alkynes. Initially, an extensive literature survey is performed to collect standard enthalpy data for training ML models. A commercial software (Dragon) is used to obtain a wide set of molecular descriptors by providing SMILES strings. The molecular descriptors are used as input features for the ML models. Support vector regression (SVR) and artificial neural networks are used with a two-level K-fold cross-validation (K-fold CV) workflow. The first level is for estimation of accuracy of both the ML models, and the second level is for generation of the final models. The SVR model is selected as the best model based on error estimates over 10-fold CV. The final SVR model is compared against conventional Benson's group additivity for a set of octene isomers from the database, illustrating the advantages of the proposed ML modeling approach.
UR - http://hdl.handle.net/10754/658596
UR - https://pubs.acs.org/doi/10.1021/acs.jpca.9b04771
UR - http://www.scopus.com/inward/record.url?scp=85072687035&partnerID=8YFLogxK
U2 - 10.1021/acs.jpca.9b04771
DO - 10.1021/acs.jpca.9b04771
M3 - Article
C2 - 31464441
SN - 1089-5639
VL - 123
SP - 8305
EP - 8313
JO - Journal of Physical Chemistry A
JF - Journal of Physical Chemistry A
IS - 38
ER -