Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA.
J Chem Phys. 2023 Mar 28;158(12):124110. doi: 10.1063/5.0139281.
Modern semiempirical electronic structure methods hold considerable promise in drug discovery as universal "force fields" that can reliably model biological and drug-like molecules, including alternative tautomers and protonation states. Herein, we compare the performance of several neglect of diatomic differential overlap (NDDO)-based semiempirical models (MNDO/d, AM1, PM6, PM6-D3H4X, PM7, and ODM2) and density-functional tight-binding (DFTB)-based models (DFTB3, DFTB/ChIMES, GFN1-xTB, and GFN2-xTB) with pure machine learning potentials (ANI-1x and ANI-2x) and hybrid quantum mechanical/machine learning potentials (AIQM1 and QDπ) for a wide range of data computed at a consistent ωB97X/6-31G* level of theory (as in the ANI-1x database). These data include conformational energies, intermolecular interactions, tautomers, and protonation states. Additional comparisons are made for a set of natural and synthetic nucleic acids from the artificially expanded genetic information system, which has important implications for the design of new biotechnology and therapeutics. Finally, we examine the acid/base chemistry relevant to RNA cleavage reactions catalyzed by small nucleolytic ribozymes, DNAzymes, and ribonucleases. Overall, the hybrid quantum mechanical/machine learning potentials appear to be the most robust for these datasets, and the recently developed QDπ model performs exceptionally well, with especially high accuracy for tautomers and protonation states relevant to drug discovery.