TY - JOUR
T1 - An efficient Bayesian kinetic lumping algorithm to identify metastable conformational states via Gibbs sampling
AU - Wang, Wei
AU - Liang, Tong
AU - Sheong, Fu Kit
AU - Fan, Xiaodan
AU - Huang, Xuhui
N1 - KAUST Repository Item: Exported on 2022-06-09
Acknowledged KAUST grant number(s): OSR-2016-CRG5-3007
Acknowledgements: We thank Dr. Lizhe Zhu for helpful discussions on this manuscript. We are grateful to D. E. Shaw Research for sharing their FiP35 trajectories. This work was supported by Shenzhen Science and Technology Innovation Committee (No. JCYJ20170413173837121), the Hong Kong Research Grant Council (Grant Nos. HKUST C6009-15G, 14203915, 16307718, 16302214, 16304215, 16318816, and AoE/P-705/16), the Chinese University of Hong Kong direct Grant (No. 3132753), King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) (No. OSR-2016-CRG5-3007), the Guangzhou Science Technology and Innovation Commission (No. 201704030116), and the Innovation and Technology Commission (Nos. ITCPD/17-9 and ITC-CNERC14SC01). X.H. is the Padma Harilela Associate Professor of Science. This research made use of the resources of the Supercomputing Laboratory at KAUST. W.W. acknowledges support from the Hong Kong Ph.D. Fellowship Scheme 2014/15 (No. PF13-14699).
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2018/8/9
Y1 - 2018/8/9
AB - Markov State Models (MSMs) have become a popular approach to studying the conformational dynamics of complex biological systems in recent years. Built upon a large number of short molecular dynamics simulation trajectories, an MSM is able to predict the long-time-scale dynamics of complex systems. However, to achieve Markovianity, an MSM often contains hundreds or thousands of states (microstates), hindering human interpretation of the underlying system mechanism. One way to reduce the number of states is to lump kinetically similar states together and thus coarse-grain the microstates into macrostates. In this work, we introduce a probabilistic lumping algorithm, the Gibbs lumping algorithm, which assigns a probability to any given kinetic lumping using Bayesian inference. In our algorithm, the transitions among kinetically distinct macrostates are modeled by Poisson processes, which reflect the separation of time scales in the underlying free energy landscape of biomolecules. Furthermore, to facilitate the search for the optimal kinetic lumping (i.e., the lumped model with the highest probability), a Gibbs sampling algorithm is introduced. To demonstrate the power of our new method, we apply it to three systems: a 2D potential, alanine dipeptide, and a WW protein domain. In comparison with six other popular lumping algorithms, we show that our method consistently produces the lumped macrostate model with the highest probability as well as the largest metastability. We anticipate that our Gibbs lumping algorithm holds great promise for wide application in investigating conformational changes in biological macromolecules.
UR - http://hdl.handle.net/10754/678793
UR - http://aip.scitation.org/doi/10.1063/1.5027001
UR - http://www.scopus.com/inward/record.url?scp=85051503769&partnerID=8YFLogxK
U2 - 10.1063/1.5027001
DO - 10.1063/1.5027001
M3 - Article
SN - 0021-9606
VL - 149
SP - 072337
JO - Journal of Chemical Physics
JF - Journal of Chemical Physics
IS - 7
ER -