Department of Biochemistry, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, Texas 75390, USA.
J Chem Inf Model. 2012 May 25;52(5):1199-212. doi: 10.1021/ci300064d. Epub 2012 Apr 24.
Accurately calculating the free energies of protein-ligand or nucleic acid-ligand binding is of great interest in modern drug design. MM-PBSA (molecular mechanics Poisson-Boltzmann surface area) and MM-GBSA (molecular mechanics generalized Born surface area) have gained popularity in this field. For both methods, the conformational entropy, which is usually calculated through normal-mode analysis (NMA), is needed to calculate absolute binding free energies. Unfortunately, NMA is computationally demanding and becomes the bottleneck of the MM-PB/GBSA-NMA methods. In this work, we have developed a fast approach to estimate the conformational entropy based on solvent accessible surface area calculations. In our approach, the conformational entropy of a molecule, S, is obtained by summing the contributions of all atoms, whether they are buried or exposed. Each atom has two types of surface area: solvent accessible surface area (SAS) and buried SAS (BSAS). The two types of surface area are weighted to estimate the contribution of an atom to S. Atoms of the same atom type share the same weight, and a general parameter k is applied to balance the contributions of the two types of surface area. This entropy model was parametrized using a large set of small molecules whose conformational entropies were calculated at the B3LYP/6-31G* level, taking the solvent effect into account. The weighted solvent accessible surface area (WSAS) model was extensively evaluated in three tests. For convenience, TS values, the product of temperature T and conformational entropy S, were calculated in those tests. T was set to 298.15 K throughout the text. In the first test, good correlations were achieved between WSAS TS and NMA TS for 44 protein or nucleic acid systems sampled with molecular dynamics simulations (10 snapshots were collected for the subsequent entropy calculations): the mean squared correlation coefficient (R²) was 0.56.
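As a rough illustration of the per-atom summation described above, the following Python sketch assumes one plausible functional form, S = Σ w_type(i) · (SAS_i + k · BSAS_i), with BSAS_i taken as the surface area of the isolated probe-expanded sphere minus SAS_i. The abstract does not state the exact formula, so this form, the probe radius, the atom-type labels, and all weight values are assumptions chosen for illustration only, not the published parameters.

```python
import math
from dataclasses import dataclass

# Assumption: a 1.4 Å water probe, a common choice for SAS calculations.
PROBE_RADIUS = 1.4


@dataclass
class Atom:
    atom_type: str  # label used to look up the weight (hypothetical labels)
    radius: float   # van der Waals radius in Å
    sas: float      # solvent accessible surface area in Å^2, computed elsewhere


def buried_sas(atom: Atom, probe: float = PROBE_RADIUS) -> float:
    """Assumed definition: area of the isolated probe-expanded sphere minus SAS."""
    full_area = 4.0 * math.pi * (atom.radius + probe) ** 2
    return max(full_area - atom.sas, 0.0)


def wsas_entropy(atoms, weights, k):
    """Sum weighted surface-area contributions over all atoms.

    weights maps atom type -> weight; k balances the BSAS term against SAS.
    The units of the result are set entirely by the units of the weights.
    """
    return sum(
        weights[a.atom_type] * (a.sas + k * buried_sas(a))
        for a in atoms
    )


if __name__ == "__main__":
    # Illustrative values only; not fitted parameters from the paper.
    atoms = [Atom("c3", 1.7, 12.0), Atom("oh", 1.5, 30.0)]
    weights = {"c3": 0.02, "oh": 0.05}
    print(wsas_entropy(atoms, weights, k=0.5))
```

Because the SAS values are passed in, any surface-area program can supply them. The sketch involves no minimization or Hessian evaluation, which is the source of the speed advantage over NMA that the abstract claims.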
For the 20 complexes, the changes in TS upon binding, TΔS, were also calculated, and the mean R² between NMA and WSAS was 0.67. In the second test, TS values were calculated for 12 protein decoy sets (each set has 31 conformations) generated by the Rosetta software package. Again, good correlations were achieved for all decoy sets: the mean, maximum, and minimum R² were 0.73, 0.89, and 0.55, respectively. Finally, binding free energies were calculated for 6 protein systems (the number of inhibitors ranges from 4 to 18) using four scoring functions. Compared to the measured binding free energies, the mean R² values over the six protein systems were 0.51, 0.47, 0.40, and 0.43 for MM-GBSA-WSAS, MM-GBSA-NMA, MM-PBSA-WSAS, and MM-PBSA-NMA, respectively. The corresponding mean rms errors of prediction were 1.19, 1.24, 1.41, and 1.29 kcal/mol. Therefore, the two scoring functions employing WSAS achieved prediction performance comparable to that of the scoring functions using NMA. It should be emphasized that no minimization was performed prior to the WSAS calculation in the last test. Although WSAS is not as rigorous as physical models such as quasi-harmonic analysis and thermodynamic integration (TI), it is computationally very efficient, as only surface area calculation is involved and no structural minimization is required. Moreover, WSAS has achieved performance comparable to that of normal-mode analysis. We expect that this model could find applications in fields such as high-throughput screening (HTS), molecular docking, and rational protein design. In those fields, efficiency is crucial because a large number of compounds, docking poses, or protein models must be evaluated. A list of acronyms and abbreviations used in this work is provided for quick reference.
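The evaluation above relies on two statistics, the squared correlation coefficient (R²) and the rms error between predicted and measured binding free energies. A minimal Python sketch of both follows, together with the standard combination ΔG = ΔH − TΔS used in MM-PB/GBSA-style scoring. The function names are hypothetical, the numbers in the usage section are invented for illustration and are not data from this work, and the abstract does not say whether the reported rms errors were computed before or after a linear fit, so the plain definition is assumed here.

```python
import math


def binding_free_energy(delta_h: float, t_delta_s: float) -> float:
    """Combine the enthalpic term (MM energy plus solvation) with the entropy term."""
    return delta_h - t_delta_s


def r_squared(x, y) -> float:
    """Square of the Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def rms_error(predicted, measured) -> float:
    """Root-mean-square deviation between predictions and measurements."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


if __name__ == "__main__":
    # Invented example values in kcal/mol, for illustration only.
    delta_h = [-32.1, -28.4, -35.0, -30.2]
    t_delta_s = [-21.0, -19.5, -23.8, -20.1]   # e.g. from WSAS or NMA
    measured = [-10.5, -9.1, -11.9, -9.8]
    predicted = [binding_free_energy(h, ts) for h, ts in zip(delta_h, t_delta_s)]
    print(r_squared(predicted, measured), rms_error(predicted, measured))
```

Swapping the source of the TΔS values between WSAS and NMA while holding the enthalpic term fixed is the comparison the four scoring functions make.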