Calculations on noncovalent interactions and databases of benchmark interaction energies.

Affiliation

Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, 166 10 Prague, Czech Republic.

Publication information

Acc Chem Res. 2012 Apr 17;45(4):663-72. doi: 10.1021/ar200255p. Epub 2012 Jan 6.

Abstract

Although covalent interactions determine the primary structure of a molecule, noncovalent interactions are responsible for its tertiary and quaternary structure and create the fascinating world of the 3D architectures of biomacromolecules. For example, the double-helical structure of DNA is of fundamental importance for its function: it allows DNA to store and transfer genetic information. To fulfill this role, the structure must be rigid enough to maintain the double helix with proper positioning of the complementary bases, yet flexible enough to allow the helix to open. Very strong covalent interactions cannot satisfy both criteria, but noncovalent interactions, which are about 2 orders of magnitude weaker, can. This Account highlights recent advances in the design of novel wave function theory (WFT) methods applicable to noncovalent complexes ranging in size from fewer than 100 atoms, for which highly accurate ab initio methods are available, up to extended ones (several thousand atoms), which are the domain of semiempirical QM (SQM) methods. Accurate interaction energies for noncovalent complexes are generated by the coupled-cluster technique, treating single and double electron excitations iteratively and triple excitations perturbatively, with a complete basis set description (CCSD(T)/CBS). The procedure provides interaction energies with high accuracy (error less than 1 kcal/mol). Because the method is computationally demanding, its application is limited to complexes smaller than 30 atoms. Researchers would, however, also like to determine these interaction energies accurately for larger biological and nanoscale structures. Standard QM methods such as MP2, MP3, CCSD, or DFT fail to describe the various types of noncovalent systems (H-bonded, stacked, dispersion-controlled, etc.) with comparable accuracy.
Therefore, novel methods parametrized for noncovalent interactions are needed, and existing benchmark data sets are an important tool for developing new methods that provide reliable characteristics of noncovalent clusters. Our laboratory developed the first suitable data set of CCSD(T)/CBS interaction energies and geometries of various noncovalent complexes, called S22. Since its publication in 2006, it has frequently been applied in the parametrization and/or verification of various wave function and density functional techniques. Through intensive use of this data set, several shortcomings emerged, such as the insufficient accuracy of the CCSD(T) correction term and the data set's unbalanced character, which triggered the introduction of a new, broader, and more accurate data set called S66. It contains not only 66 CCSD(T)/CBS interaction energies determined at the equilibrium geometries but also 1056 interaction energies calculated at the same level for nonequilibrium geometries. The S22 and S66 data sets have been used for the verification of various WFT methods, and the lowest RMSE values (S66, in kcal/mol) were found for the recently introduced SCS-MI-CCSD/CBS (0.08), MP2.5/CBS (0.16), MP2.X/6-31G* (0.27), and SCS-MI-MP2/CBS (0.38) methods. Because of their computational economy, the MP2.5 and MP2.X/6-31G* methods can be recommended for highly accurate calculations of large complexes with up to 100 atoms. The evaluation of SQM methods was based only on the S22 data set, and because some of these methods were parametrized against the same data set, the respective results should be taken with caution. For truly extended complexes such as protein-ligand systems, only the SQM methods are applicable. After corrections for the dispersion energy and H-bonding are added, several methods exhibit surprisingly low RMSE (even below 0.5 kcal/mol).
Among the various SQM methods, PM6-DH2 can be recommended because of its computational efficiency and because it can be used for optimization (which is not the case for other SQM methods). PM6-DH2 is the basis of our novel scoring function used in in silico drug design.
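
The RMSE figures quoted in the abstract compare a method's interaction energies against CCSD(T)/CBS reference values over a benchmark set. A minimal Python sketch of that comparison follows, assuming the usual supermolecular definition of the interaction energy (complex energy minus monomer energies). The function names and all numbers are illustrative placeholders, not values from S22 or S66, and this is not the authors' code.

```python
import math


def interaction_energy(e_complex, e_monomer_a, e_monomer_b):
    """Supermolecular interaction energy E(AB) - E(A) - E(B) (assumed definition)."""
    return e_complex - e_monomer_a - e_monomer_b


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of energies."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder interaction energies in kcal/mol, for illustration only.
# They are NOT taken from the S22 or S66 data sets.
reference = [-5.0, -3.0, -1.0]   # stands in for CCSD(T)/CBS benchmark values
candidate = [-4.9, -3.2, -1.1]   # stands in for the method being tested

print(f"RMSE = {rmse(candidate, reference):.3f} kcal/mol")
```

With real data, `reference` would hold the benchmark interaction energies and `candidate` the energies from the method under evaluation, in the same complex order and in the same units.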
