Building quantum mechanics quality force fields of proteins with the generalized energy-based fragmentation approach and machine learning.

Affiliation

Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.

Publication information

Phys Chem Chem Phys. 2022 Jan 19;24(3):1326-1337. doi: 10.1039/d1cp03934b.

Abstract

We combined our generalized energy-based fragmentation (GEBF) approach with machine learning (ML) techniques to construct quantum mechanics (QM) quality force fields for proteins. In our scheme, the training sets for a protein are constructed only from its small subsystems, which capture all short-range interactions in the target system. The energy of a given protein is expressed as the sum of atomic contributions from QM calculations on the various subsystems, corrected by long-range Coulomb and van der Waals interactions. With the Gaussian approximation potential (GAP) method, our protocol can automatically generate training sets with high efficiency. To facilitate the construction of training sets for proteins, we store all trained subsystem data in a library. If subsystems in the library are detected in a new protein, the corresponding datasets can be reused directly as part of the training set for that protein. With two polypeptides, 4ZNN and a 1XQ8 segment, as examples, the energies and forces predicted by GEBF-GAP are in good agreement with those from conventional QM calculations, and the dihedral angle distributions from GEBF-GAP molecular dynamics (MD) simulations closely reproduce those from ab initio MD simulations. In addition, using the training set generated with GEBF-GAP, we demonstrate that GEBF-ML force fields constructed with neural network (NN) methods can also reach QM quality. The present work therefore provides an efficient and systematic way to build QM quality force fields for biological systems.
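The energy expression described in the abstract (a sum of atomic contributions plus long-range Coulomb and van der Waals corrections) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the distance cutoff that separates short-range from long-range pairs, the fixed point charges, the Lennard-Jones form with Lorentz-Berthelot mixing, and the kcal/mol and Å units are all assumptions made for illustration. The per-atom energies are taken as given inputs, standing in for the output of a trained model.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2); the unit choice is an assumption.
COULOMB_K = 332.0637


def long_range_correction(coords, charges, lj_params, cutoff):
    """Coulomb + Lennard-Jones energy over atom pairs farther apart than `cutoff`.

    Hypothetical helper: pairs within `cutoff` are assumed to be described
    already by the subsystem-derived atomic energies and are skipped.
    lj_params[i] is an (epsilon, sigma) tuple for atom i.
    """
    energy = 0.0
    n_atoms = len(coords)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = math.dist(coords[i], coords[j])
            if r <= cutoff:
                continue
            # Point-charge Coulomb term.
            energy += COULOMB_K * charges[i] * charges[j] / r
            # Lennard-Jones term with Lorentz-Berthelot mixing (assumed).
            eps = math.sqrt(lj_params[i][0] * lj_params[j][0])
            sigma = 0.5 * (lj_params[i][1] + lj_params[j][1])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * eps * (sr6 * sr6 - sr6)
    return energy


def total_energy(atomic_energies, coords, charges, lj_params, cutoff):
    """Sum of per-atom energies plus the long-range pairwise correction."""
    return sum(atomic_energies) + long_range_correction(
        coords, charges, lj_params, cutoff
    )
```

In this sketch the short-range physics lives entirely in `atomic_energies`, so the pairwise loop only needs to handle distant pairs, which mirrors the division of labor the abstract describes between subsystem QM data and classical long-range terms.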
