State Key Laboratory of Precision Spectroscopy and Department of Physics, Institute of Theoretical and Computational Science, East China Normal University, Shanghai 200062, China.
J Phys Chem A. 2013 Aug 15;117(32):7149-61. doi: 10.1021/jp400779t. Epub 2013 Mar 27.
An electrostatically embedded generalized molecular fractionation with conjugate caps (EE-GMFCC) method is developed for efficient linear-scaling quantum mechanical (QM) calculation of protein energy. This approach is based on our previously proposed GMFCC/MM method (He; et al. J. Chem. Phys. 2006, 124, 184703). In the EE-GMFCC scheme, the total energy of a protein is calculated as a linear combination of the QM energies of neighboring residues and the two-body QM interaction energies between non-neighboring residues that are in close spatial contact. All fragment calculations are embedded in a field of point charges representing the remaining protein environment, which is the major improvement over our previous GMFCC/MM approach. Numerical studies are carried out to calculate the total energies of 18 real three-dimensional proteins of up to 1142 atoms using the EE-GMFCC approach at the HF/6-31G* level. The overall mean unsigned error of EE-GMFCC for the 18 proteins is 2.39 kcal/mol with reference to the full-system HF/6-31G* energies. The EE-GMFCC approach is also applied to proteins at the levels of density functional theory (DFT) and second-order many-body perturbation theory (MP2), where it likewise deviates from the corresponding full-system results by only a few kcal/mol. The EE-GMFCC method is linear-scaling with a low prefactor and trivially parallel, and it can be readily applied to routine structural optimization and molecular dynamics simulation of proteins with high-level ab initio electronic structure theories.
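The linear combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`close_pairs`, `assemble_energy`), the residue-center representation, the `cutoff` and `min_separation` parameters, and the input layout are all hypothetical. It assumes the embedded QM energies of the capped fragments, of the overlapping conjugate caps, and of the residue pairs have already been computed by some external QM program. Any correction for electrostatic interactions that are counted more than once across the embedded fragments is left out here.

```python
import math
from itertools import combinations


def close_pairs(centers, cutoff, min_separation=3):
    """Return index pairs (i, j) of residues that are not sequence neighbors
    (j - i >= min_separation) but lie within `cutoff` of each other.
    `centers` holds one representative (x, y, z) point per residue
    (a simplifying assumption made for this sketch)."""
    pairs = []
    for i, j in combinations(range(len(centers)), 2):
        if j - i >= min_separation and math.dist(centers[i], centers[j]) <= cutoff:
            pairs.append((i, j))
    return pairs


def assemble_energy(fragment_energies, cap_energies, pair_terms):
    """Combine precomputed fragment energies into a total energy estimate.

    fragment_energies: embedded QM energies of the capped residue fragments
    cap_energies:      energies of the overlapping cap pairs, subtracted so
                       that shared atoms are not counted twice
    pair_terms:        {(i, j): (E_ij, E_i, E_j)} for close non-neighboring
                       residues; each contributes E_ij - E_i - E_j
    """
    total = sum(fragment_energies) - sum(cap_energies)
    for e_ij, e_i, e_j in pair_terms.values():
        total += e_ij - e_i - e_j  # two-body interaction energy
    return total


if __name__ == "__main__":
    # Toy numbers in arbitrary units, for illustration only.
    centers = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (1, 3, 0)]
    print(close_pairs(centers, cutoff=4.0))
    print(assemble_energy([-10.0, -12.0, -11.0], [-4.0, -5.0],
                          {(0, 3): (-20.5, -10.0, -10.0)}))
```

Because every fragment, cap, and pair energy is an independent calculation, the inputs to `assemble_energy` could be produced in parallel, which is consistent with the abstract's description of the method as trivially parallel.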