Hostaš Jiří, Jakubec Dávid, Laskowski Roman A, Gnanasekaran Ramachandran, Řezáč Jan, Vondrášek Jiří, Hobza Pavel
Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, 166 10 Prague, Czech Republic.
Department of Physical and Macromolecular Chemistry, Faculty of Science, Charles University in Prague, Albertov 6, 128 43 Prague, Czech Republic.
J Chem Theory Comput. 2015 Sep 8;11(9):4086-92. doi: 10.1021/acs.jctc.5b00398. Epub 2015 Aug 6.
Representative pairs of amino acid side chains and nucleic acid bases, extracted from available high-quality structures of protein-DNA complexes, were analyzed using a range of computational methods. CCSD(T)/CBS interaction energies were calculated for the 272 chosen pairs. These reference interaction energies were used to test the MP2.5/CBS, MP2.X/CBS, MP2-F12, DFT-D3, PM6, and Amber force field methods. The MP2.5 method agreed excellently with the reference data (root-mean-square error (RMSE) of 0.11 kcal/mol) while being more than 1 order of magnitude faster than the CCSD(T) method. When MP2-F12 and MP2.5 were combined, the results remained within reasonable accuracy (0.20 kcal/mol), with computational savings of almost 2 orders of magnitude. This combination is therefore a promising tool for accurate calculations of interaction energies in protein-DNA motifs of up to ∼100 atoms, for which CCSD(T)/CBS benchmark calculations are not feasible. B3-LYP-D3 with the def2-TZVPP and def2-QZVP basis sets yielded sufficiently good results with a reasonably small RMSE. It performed better for neutral systems, whereas positively charged species showed the worst agreement with the benchmark data. The Amber force field yielded unbalanced results: it performed well for systems containing nonpolar amino acids but severely underestimated interaction energies for charged complexes. The semiempirical PM6 method with corrections for hydrogen bonding and dispersion energy (PM6-D3H4) exhibited a considerably smaller error than the Amber force field, which makes it an effective tool for modeling extended protein-ligand complexes (of up to 10,000 atoms).
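The RMSE values quoted in the abstract compare each method's interaction energies with the CCSD(T)/CBS reference over the set of pairs. A minimal sketch of that comparison follows; the function name `rmse` and the four sample energies are hypothetical illustrations and are not taken from the paper's data set.

```python
import math


def rmse(method_energies, reference_energies):
    """Root-mean-square error of a method against reference energies.

    Both inputs are sequences of interaction energies in the same unit
    (kcal/mol here), ordered so that index i refers to the same pair.
    """
    if len(method_energies) != len(reference_energies):
        raise ValueError("energy lists must have the same length")
    if len(reference_energies) == 0:
        raise ValueError("at least one pair is required")
    squared_errors = [
        (m - r) ** 2 for m, r in zip(method_energies, reference_energies)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical interaction energies in kcal/mol, for illustration only.
reference = [-5.0, -12.0, -1.5, -20.0]    # stands in for CCSD(T)/CBS values
approximate = [-4.9, -12.2, -1.4, -19.8]  # stands in for a cheaper method

print(f"RMSE = {rmse(approximate, reference):.2f} kcal/mol")
```

Because the squared errors are averaged over all pairs, a single aggregate RMSE can hide the subset-dependent behaviour the abstract reports for neutral versus charged systems; applying the same function to each subset separately would expose it.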