Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China.
National Laboratory of Solid State Microstructure, Department of Physics, and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China.
J Chem Theory Comput. 2020 Sep 8;16(9):5936-5947. doi: 10.1021/acs.jctc.0c00340. Epub 2020 Aug 25.
The human telomeric DNA G-quadruplex folds through a kinetic partitioning mechanism. The underlying folding landscape may contain many minima separated by high free-energy barriers, and characterizing this complex landscape with current theoretical models remains challenging. In this study, we developed a hybrid atomistic structure-based model that merges structural information on the hybrid-1, hybrid-2, and chair-type G-quadruplex topologies, and used it to investigate the kinetic partitioning folding process of human telomeric DNA involving three native folds. The model was validated by reproducing the experimental observations that the hybrid-1 conformation is the major fold and that the hybrid-2 conformation is kinetically more accessible. A three-step mechanism was revealed for the formation of the hybrid-1 conformation, whereas a two-step mechanism was found for the formation of the hybrid-2 and chair-type conformations. In addition, a class of states in which structures adopt inappropriate combinations of syn/anti guanine nucleotides was found to greatly slow the folding process. Furthermore, using the XGBoost machine learning algorithm, three interatomic distances and six dihedral angles were identified as essential internal coordinates for representing the low-dimensional folding landscape. The strategy of coupling the multibasin model with the machine learning algorithm may be useful for investigating the conformational dynamics of other multistate biomolecules.
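The coordinate-selection idea in the abstract (ranking candidate internal coordinates by how much a boosted-tree model relies on them) can be illustrated with a minimal sketch. This is not the authors' pipeline: the study used XGBoost on simulation data, whereas the code below swaps in a small pure-Python gradient-boosted decision-stump regressor so that it runs without dependencies, and it uses synthetic data. The feature names (`dist_1`, `dihedral_1`, and so on), the toy labeling rule, and all hyperparameters are assumptions made for illustration only.

```python
import random

# Hypothetical candidate coordinates (names are illustrative, not from the paper).
FEATURES = ["dist_1", "dist_2", "dist_3", "dihedral_1", "dihedral_2", "dihedral_3"]


def best_stump(X, residuals, n_thresholds=8):
    """Return the one-feature threshold split that most reduces squared error."""
    n = len(X)
    total = sum(residuals)
    best = None  # (gain, feature_index, threshold, left_value, right_value)
    for j in range(len(X[0])):
        values = sorted(row[j] for row in X)
        for q in range(1, n_thresholds):
            t = values[q * n // n_thresholds]  # candidate threshold at a quantile
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            n_left = len(left)
            if n_left == 0 or n_left == n:
                continue  # split puts everything on one side; skip it
            s_left = sum(left)
            s_right = total - s_left
            n_right = n - n_left
            # Reduction in squared error from fitting one constant per side.
            gain = s_left ** 2 / n_left + s_right ** 2 / n_right - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, t, s_left / n_left, s_right / n_right)
    return best


def rank_features(X, y, n_rounds=30, learning_rate=0.3):
    """Boost stumps on the residuals and accumulate split gain per feature."""
    pred = [sum(y) / len(y)] * len(y)
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = best_stump(X, residuals)
        if stump is None:
            break
        gain, j, t, left_value, right_value = stump
        importance[j] += gain
        pred = [
            p + learning_rate * (left_value if row[j] <= t else right_value)
            for row, p in zip(X, pred)
        ]
    total_gain = sum(importance) or 1.0
    return [g / total_gain for g in importance]  # normalized to sum to 1


# Synthetic "frames": each row holds six coordinates scaled to [0, 1].
rng = random.Random(0)
X = [[rng.random() for _ in FEATURES] for _ in range(300)]
# Toy labeling rule (an assumption): the state depends only on dist_1 and dihedral_1.
y = [1.0 if row[0] + row[3] > 1.0 else 0.0 for row in X]

importance = rank_features(X, y)
ranked = [name for _, name in sorted(zip(importance, FEATURES), reverse=True)]
for name in ranked:
    print(f"{name}: {importance[FEATURES.index(name)]:.3f}")
```

In this toy setting the two coordinates that define the label should dominate the gain-based ranking, while the remaining coordinates receive little weight. With real trajectories, the labels would instead be conformational states assigned from the simulations, and a library implementation such as XGBoost would replace the hand-written booster.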