A Maximum-Likelihood Approach to Force-Field Calibration.

Affiliations

Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland.

Laboratory of Biopolymer Structure, Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, Kładki 24, 80-922 Gdańsk, Poland.

Publication information

J Chem Inf Model. 2015 Sep 28;55(9):2050-70. doi: 10.1021/acs.jcim.5b00395. Epub 2015 Aug 20.

DOI: 10.1021/acs.jcim.5b00395

PMID: 26263302

Abstract

A new approach to the calibration of force fields is proposed, in which the force-field parameters are obtained by maximum-likelihood fitting of the calculated conformational ensembles to the experimental ensembles of the training system(s). The maximum-likelihood function is composed of logarithms of the Boltzmann probabilities of the experimental conformations, calculated with the current energy function. Because the theoretical distribution is available only in the form of simulated conformations, the contributions from all of the simulated conformations, with Gaussian weights in the distances from a given experimental conformation, are added to give the contribution to the target function from this conformation. In contrast to earlier methods for force-field calibration, the approach does not suffer from the arbitrariness of dividing the decoy set into native-like and non-native structures; however, if such a division is made instead of using Gaussian weights, application of the maximum-likelihood method results in the well-known energy-gap maximization. The computational procedure consists of cycles of decoy generation and maximum-likelihood-function optimization, which are iterated until convergence is reached. The method was tested with Gaussian distributions and then applied to the physics-based coarse-grained UNRES force field for proteins. The NMR structures of the tryptophan cage, a small α-helical protein, determined at three temperatures (T = 280, 305, and 313 K) by Hałabis et al. (J. Phys. Chem. B 2012, 116, 6898-6907), were used. Multiplexed replica-exchange molecular dynamics was used to generate the decoys. The iterative procedure exhibited steady convergence.
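The target function described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that each conformation is represented by a hypothetical descriptor vector, that distances are Euclidean, that the decoys were sampled at a single temperature with a reference energy function (`ref_energies`) and are reweighted to the trial parameters, and that `sigma` is a user-chosen Gaussian width. The function name and all arguments are illustrative.

```python
import numpy as np


def gaussian_weighted_log_likelihood(exp_coords, decoy_coords,
                                     decoy_energies, ref_energies,
                                     kT, sigma):
    """Sum over experimental conformations of ln(sum_j g_ij * p_j).

    exp_coords:     (n_exp, d) experimental conformations (assumed descriptors)
    decoy_coords:   (n_dec, d) simulated conformations (decoys)
    decoy_energies: (n_dec,) decoy energies under the trial parameters
    ref_energies:   (n_dec,) decoy energies under the sampling parameters
    kT:             thermal energy, same units as the energies
    sigma:          width of the Gaussian distance weight (assumed, user-chosen)
    """
    exp_coords = np.asarray(exp_coords, dtype=float)
    decoy_coords = np.asarray(decoy_coords, dtype=float)
    e_trial = np.asarray(decoy_energies, dtype=float)
    e_ref = np.asarray(ref_energies, dtype=float)

    # Boltzmann reweighting of each decoy from the sampling energy function
    # to the trial one; subtracting the maximum avoids overflow in exp().
    log_w = -(e_trial - e_ref) / kT
    log_w -= log_w.max()
    p = np.exp(log_w)
    p /= p.sum()

    # Squared Euclidean distance between every experimental conformation
    # and every decoy, shape (n_exp, n_dec).
    diff = exp_coords[:, None, :] - decoy_coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Gaussian weights in the distances, then the weighted decoy probability
    # assigned to each experimental conformation.
    g = np.exp(-d2 / (2.0 * sigma ** 2))
    prob = g @ p

    # Guard against log(0) when no decoy lies near an experimental point.
    prob = np.maximum(prob, np.finfo(float).tiny)
    return float(np.sum(np.log(prob)))
```

Lowering the trial energy of decoys that lie close to experimental conformations raises the returned value, which is the quantity an optimizer would maximize in each cycle before new decoys are generated.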
Three variants of optimization were tried: optimization of the energy-term weights alone and use of the experimental ensemble of the folded protein only at T = 280 K (run 1); optimization of the energy-term weights and use of experimental ensembles at all three temperatures (run 2); and optimization of the energy-term weights and the coefficients of the torsional and multibody energy terms and use of experimental ensembles at all three temperatures (run 3). The force fields were subsequently tested with a set of 14 α-helical and two α + β proteins. Optimization run 1 resulted in better agreement with the experimental ensemble at T = 280 K compared with optimization run 2 and in comparable performance on the test set but poorer agreement of the calculated folding temperature with the experimental folding temperature. Optimization run 3 resulted in the best fit of the calculated ensembles to the experimental ones for the tryptophan cage but in much poorer performance on the test set, suggesting that use of a small α-helical protein for extensive force-field calibration resulted in overfitting of the data for this protein at the expense of transferability. The optimized force field resulting from run 2 was found to fold 13 of the 14 tested α-helical proteins and one small α + β protein with the correct topologies; the average structures of 10 of them were predicted with accuracies of about 5 Å Cα root-mean-square deviation or better. Test simulations with an additional set of 12 α-helical proteins demonstrated that this force field performed better on α-helical proteins than the previous parametrizations of UNRES. The proposed approach is applicable to any problem of maximum-likelihood parameter estimation when the contributions to the maximum-likelihood function cannot be evaluated at the experimental points and the dimension of the configurational space is too high to construct histograms of the experimental distributions.
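The accuracies quoted above are Cα root-mean-square deviations. As a reference for how such a number is typically obtained, here is a minimal sketch based on the Kabsch superposition. The abstract does not state which superposition procedure was used, so that choice, the function name `ca_rmsd`, and the input layout (two arrays of shape `(n, 3)` in Å, with matching residue order) are assumptions.

```python
import numpy as np


def ca_rmsd(coords_a, coords_b):
    """RMSD between two (n, 3) coordinate sets after optimal superposition.

    Both inputs are assumed to list the same C-alpha atoms in the same order.
    The result has the same length unit as the inputs.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (n, 3)")

    # Remove the translational difference by centering both structures.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)

    # Kabsch algorithm: SVD of the 3x3 covariance matrix.
    h = a.T @ b
    u, _, vt = np.linalg.svd(h)

    # Correct for a possible reflection so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate structure a onto structure b and measure the residual deviation.
    a_rot = a @ rot.T
    return float(np.sqrt(np.mean(np.sum((a_rot - b) ** 2, axis=1))))
```

A rigidly rotated and translated copy of a structure gives a value of essentially zero, while genuine conformational differences remain in the result.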

Similar articles

Maximum Likelihood Calibration of the UNRES Force Field for Simulation of Protein Structure and Dynamics.

J Chem Inf Model. 2017 Sep 25;57(9):2364-2377. doi: 10.1021/acs.jcim.7b00254. Epub 2017 Sep 5.

Modification and optimization of the united-residue (UNRES) potential energy function for canonical simulations. I. Temperature dependence of the effective energy function and tests of the optimization method with single training proteins.

J Phys Chem B. 2007 Jan 11;111(1):260-85. doi: 10.1021/jp065380a.

Extension of the force-matching method to coarse-grained models with axially symmetric sites to produce transferable force fields: Application to the UNRES model of proteins.

J Chem Phys. 2020 Feb 7;152(5):054902. doi: 10.1063/1.5138991.

A comparative study of two different force fields on structural and thermodynamics character of H1 peptide via molecular dynamics simulations.

J Biomol Struct Dyn. 2010 Apr;27(5):651-61. doi: 10.1080/07391102.2010.10508579.

Determination of conformational equilibrium of peptides in solution by NMR spectroscopy and theoretical conformational analysis: application to the calibration of mean-field solvation models.

Biopolymers. 2001;60(2):79-95. doi: 10.1002/1097-0282(2001)60:2<79::AID-BIP1006>3.0.CO;2-L.

An improved functional form for the temperature scaling factors of the components of the mesoscopic UNRES force field for simulations of protein structure and dynamics.

J Phys Chem B. 2009 Jun 25;113(25):8738-44. doi: 10.1021/jp901788q.

A Rigorous and Efficient Method To Reweight Very Large Conformational Ensembles Using Average Experimental Data and To Determine Their Relative Information Content.

J Chem Theory Comput. 2016 Jan 12;12(1):383-94. doi: 10.1021/acs.jctc.5b00759. Epub 2015 Dec 2.

Physics-based potentials for the coupling between backbone- and side-chain-local conformational states in the UNited RESidue (UNRES) force field for protein simulations.

J Chem Theory Comput. 2015 Feb 10;11(2):817-31. doi: 10.1021/ct500736a.

Exploring the parameter space of the coarse-grained UNRES force field by random search: selecting a transferable medium-resolution force field.

J Comput Chem. 2009 Oct;30(13):2127-35. doi: 10.1002/jcc.21215.

Cited by

Decoding Solubility Signatures from Amyloid Monomer Energy Landscapes.

J Chem Theory Comput. 2025 Mar 11;21(5):2736-2756. doi: 10.1021/acs.jctc.4c01623. Epub 2025 Feb 24.

Secondary Structure in Free and Assisted Modeling of Proteins with the Coarse-Grained UNRES Force Field.

Methods Mol Biol. 2025;2867:19-41. doi: 10.1007/978-1-0716-4196-5_2.

Free-Docking and Template-Based Docking: Physics Versus Knowledge-Based Docking.

Methods Mol Biol. 2024;2780:27-41. doi: 10.1007/978-1-0716-3985-6_3.

Integrating Explicit and Implicit Fullerene Models into UNRES Force Field for Protein Interaction Studies.

Molecules. 2024 Apr 23;29(9):1919. doi: 10.3390/molecules29091919.

Long-Time Dynamics of Selected Molecular-Motor Components Using a Physics-Based Coarse-Grained Approach.

Biomolecules. 2023 Jun 5;13(6):941. doi: 10.3390/biom13060941.

Analysis of Protein Folding Simulation with Moving Root Mean Square Deviation.

J Chem Inf Model. 2023 Mar 13;63(5):1529-1541. doi: 10.1021/acs.jcim.2c01444. Epub 2023 Feb 23.

Improvements and new functionalities of UNRES server for coarse-grained modeling of protein structure, dynamics, and interactions.

Front Mol Biosci. 2022 Dec 14;9:1071428. doi: 10.3389/fmolb.2022.1071428. eCollection 2022.

Theory and Practice of Coarse-Grained Molecular Dynamics of Biologically Important Systems.

Biomolecules. 2021 Sep 11;11(9):1347. doi: 10.3390/biom11091347.

Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins.

PLoS One. 2021 Sep 2;16(9):e0256990. doi: 10.1371/journal.pone.0256990. eCollection 2021.

Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours.

PLoS Comput Biol. 2018 Dec 27;14(12):e1006578. doi: 10.1371/journal.pcbi.1006578. eCollection 2018 Dec.
