Treating Semiempirical Hamiltonians as Flexible Machine Learning Models Yields Accurate and Interpretable Results.

Authors

Hu Frank, He Francis, Yaron David J

Affiliation

Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States.

Publication information

J Chem Theory Comput. 2023 Sep 26;19(18):6185-6196. doi: 10.1021/acs.jctc.3c00491. Epub 2023 Sep 13.

DOI:10.1021/acs.jctc.3c00491
PMID:37705220
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10536991/
Abstract

Quantum chemistry provides chemists with invaluable information, but the high computational cost limits the size and type of systems that can be studied. Machine learning (ML) has emerged as a means to dramatically lower the cost while maintaining high accuracy. However, ML models often sacrifice interpretability by using components such as the artificial neural networks of deep learning that function as black boxes. These components impart the flexibility needed to learn from large volumes of data but make it difficult to gain insight into the physical or chemical basis for the predictions. Here, we demonstrate that semiempirical quantum chemical (SEQC) models can learn from large volumes of data without sacrificing interpretability. The SEQC model is that of density-functional-based tight binding (DFTB) with fixed atomic orbital energies and interactions that are one-dimensional functions of the interatomic distance. This model is trained to data in a manner that is analogous to that used to train deep learning models. Using benchmarks that reflect the accuracy of the training data, we show that the resulting model maintains a physically reasonable functional form while achieving an accuracy, relative to coupled cluster energies with a complete basis set extrapolation (CCSD(T)*/CBS), that is comparable to that of density functional theory (DFT). This suggests that trained SEQC models can achieve a low computational cost and high accuracy without sacrificing interpretability. Use of a physically motivated model form also substantially reduces the amount of data needed to train the model compared to that required for deep learning models.
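
The idea of treating a one-dimensional function of interatomic distance as a trainable model can be illustrated with a toy sketch. This is not the authors' code and it is not DFTB: it assumes a single pairwise energy term represented by piecewise-linear values on a fixed grid of knots, a synthetic Morse-like curve standing in for high-level reference energies, and plain gradient descent on the mean squared error. All function names, the knot grid, and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np

def hat_basis(r, knots):
    """Piecewise-linear basis: row i holds the weight of each knot at distance r[i]."""
    r = np.asarray(r, dtype=float)
    basis = np.zeros((r.size, knots.size))
    # Index of the left knot of the interval that contains each distance.
    idx = np.clip(np.searchsorted(knots, r, side="right") - 1, 0, knots.size - 2)
    left, right = knots[idx], knots[idx + 1]
    t = (r - left) / (right - left)
    rows = np.arange(r.size)
    basis[rows, idx] = 1.0 - t
    basis[rows, idx + 1] = t
    return basis

def reference_energy(r):
    """Synthetic stand-in for reference data (assumption: a Morse-like curve)."""
    return (1.0 - np.exp(-1.0 * (r - 1.2))) ** 2 - 1.0

def train(knots, r_train, e_train, lr=2.0, steps=5000):
    """Fit the knot values by gradient descent on the mean squared error."""
    B = hat_basis(r_train, knots)
    values = np.zeros(knots.size)
    for _ in range(steps):
        residual = B @ values - e_train
        grad = 2.0 * B.T @ residual / r_train.size
        values -= lr * grad
    return values

rng = np.random.default_rng(0)
knots = np.linspace(0.8, 4.0, 17)            # hypothetical distance grid
r_train = rng.uniform(0.8, 4.0, size=400)    # sampled interatomic distances
e_train = reference_energy(r_train)
values = train(knots, r_train, e_train)

# Evaluate the learned curve on held-out distances.
r_test = np.linspace(0.9, 3.9, 50)
pred = hat_basis(r_test, knots) @ values
max_err = np.max(np.abs(pred - reference_energy(r_test)))
```

Because the trained parameters are simply the values of the curve at each knot, plotting `values` against `knots` shows the learned interaction directly, which is the sense in which such a model stays interpretable. A real SEQC model would instead feed distance-dependent functions into Hamiltonian matrix elements and solve an electronic structure problem before computing the loss.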

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/50ee0bf08f63/ct3c00491_0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/3d1bcb857796/ct3c00491_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/bbf7c075ec97/ct3c00491_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/e83a94193b74/ct3c00491_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/33b256d818cb/ct3c00491_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/bc434b56b7b1/ct3c00491_0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/5feb53d7ae4e/ct3c00491_0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/d532a7b603c8/ct3c00491_0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02ff/10536991/95b99f1645bd/ct3c00491_0009.jpg
