Anstine Dylan M, Zubatyuk Roman, Isayev Olexandr
Department of Chemistry, Mellon College of Science, Carnegie Mellon University Pittsburgh Pennsylvania 15213 USA
Chem Sci. 2025 Apr 29. doi: 10.1039/d4sc08572h.
Machine learned interatomic potentials (MLIPs) are reshaping computational chemistry practices because of their ability to drastically exceed the accuracy-length/time scale tradeoff. Despite this attraction, the benefits of such efficiency are only impactful when an MLIP uniquely enables insight into a target system or is broadly transferable outside of the training dataset. In this work, we present the second generation of our atoms-in-molecules neural network potential (AIMNet2), which is applicable to species composed of up to 14 chemical elements in both neutral and charged states, making it a valuable method for modeling the majority of non-metallic compounds. Using an exhaustive dataset of 2 × 10⁷ hybrid DFT level of theory quantum chemical calculations, AIMNet2 combines ML-parameterized short-range and physics-based long-range terms to attain generalizability that reaches from simple organics to diverse molecules with "exotic" element-organic bonding. We show that AIMNet2 outperforms semi-empirical GFN2-xTB and is on par with reference density functional theory for interaction energy contributions, conformer search tasks, torsion rotation profiles, and molecular-to-macromolecular geometry optimization. Overall, the demonstrated chemical coverage and computational efficiency of AIMNet2 are a significant step toward providing access to MLIPs that avoid the crucial limitation of curating additional quantum chemical data and retraining with each new application.