Zhu Qiang, Wu Yongxian, Zhao Shiji, Cieplak Piotr, Duan Yong, Luo Ray
Department of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, Irvine, California 92697, United States.
Nurix Therapeutics, Inc., 1700 Owens St, Suite 205, San Francisco, California 94158, United States.
J Chem Theory Comput. 2023 Sep 26;19(18):6353-6365. doi: 10.1021/acs.jctc.3c00659. Epub 2023 Sep 7.
Accurate characterization of electrostatic interactions is crucial in molecular simulation. Various methods and programs have been developed to obtain electrostatic parameters for additive or polarizable models that replicate electrostatic properties obtained from experimental measurements or theoretical calculations. Electrostatic potentials (ESPs), a set of physically well-defined observables from quantum mechanical (QM) calculations, are well suited for optimization efforts because large amounts of conformation-dependent data are easy to collect. However, a reliable set of QM ESPs computed at an appropriate level of theory and atomic basis set is necessary. In addition, despite the recent development of the PyRESP program for electrostatic parameterization of induced dipole-polarizable models, the time-consuming and error-prone preparation of input files has limited the widespread use of these protocols. This work aims to comprehensively evaluate the quality of QM ESPs derived by eight methods, including wave function methods such as Hartree-Fock (HF), second-order Møller-Plesset (MP2), and coupled cluster singles and doubles (CCSD), as well as five hybrid density functional theory (DFT) methods, used in conjunction with 13 different basis sets. The highest levels of theory, CCSD/aug-cc-pV5Z (a5z) and MP2/aug-cc-pV5Z (a5z), were selected as benchmark data over two homemade data sets. The results show that the hybrid DFT method ωB97X-D, combined with the aug-cc-pVTZ (a3z) basis set, performs well in reproducing ESPs when both accuracy and efficiency are taken into consideration. Moreover, a flexible and user-friendly program called PyRESP_GEN was developed to streamline input file preparation. The restraining strengths, along with strategies for polarizable Gaussian multipole (pGM) model parameterizations, were also optimized. These findings and the program presented in this work facilitate the development and application of induced dipole-polarizable models, such as pGM models, for molecular simulations of both chemical and biological significance.
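The benchmarking idea in the abstract, comparing ESPs from a cheaper level of theory against a high-level reference on the same grid points, can be illustrated with a simple error metric. The abstract does not state which metric the authors used, so the root-mean-square error (RMSE) and relative RMSE below are assumptions chosen for illustration. The function names `esp_rmse` and `esp_rrmse` and the numerical values are hypothetical and are not taken from the paper.

```python
import math


def esp_rmse(test_esp, ref_esp):
    """Root-mean-square error between two ESP lists on the same grid points."""
    if len(test_esp) != len(ref_esp) or not ref_esp:
        raise ValueError("ESP lists must be non-empty and of equal length")
    sq_sum = sum((t - r) ** 2 for t, r in zip(test_esp, ref_esp))
    return math.sqrt(sq_sum / len(ref_esp))


def esp_rrmse(test_esp, ref_esp):
    """RMSE divided by the root-mean-square magnitude of the reference ESP."""
    ref_rms = math.sqrt(sum(r * r for r in ref_esp) / len(ref_esp))
    if ref_rms == 0.0:
        raise ValueError("reference ESP is zero everywhere")
    return esp_rmse(test_esp, ref_esp) / ref_rms


# Made-up ESP values (atomic units) at four grid points, for illustration only.
reference = [0.010, -0.020, 0.030, -0.040]   # stands in for a benchmark level
candidate = [0.011, -0.019, 0.029, -0.041]   # stands in for a cheaper level

print(f"RMSE  = {esp_rmse(candidate, reference):.6f}")
print(f"RRMSE = {esp_rrmse(candidate, reference):.4f}")
```

A relative metric makes errors comparable across molecules whose ESP magnitudes differ, for example neutral versus charged species. Whether the paper normalizes its errors this way is not stated in the abstract.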