Jeong WooSeok, Stoneburner Samuel J, King Daniel, Li Ruye, Walker Andrew, Lindh Roland, Gagliardi Laura
Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States.
Department of Computer Science and Engineering, University of Minnesota, 200 Union Street Southeast, Minneapolis, Minnesota 55455, United States.
J Chem Theory Comput. 2020 Apr 14;16(4):2389-2399. doi: 10.1021/acs.jctc.9b01297. Epub 2020 Mar 12.
Predicting and understanding the chemical bond is one of the major challenges of computational quantum chemistry. Kohn-Sham density functional theory (KS-DFT) is the most commonly used method, but approximate density functionals may not be able to describe systems where multiple electronic configurations are equally important. Multiconfigurational wave functions, on the other hand, can provide a detailed understanding of the electronic structures and chemical bonds of such systems. In the complete active space self-consistent field (CASSCF) method, one performs a full configuration interaction calculation in an active space consisting of active electrons and active orbitals. However, CASSCF and its variants require the user to select these active spaces. This choice is not a black-box procedure; it requires significant experience and testing by the user, and thus active space methods are not considered particularly user-friendly and are employed only by a minority of quantum chemists. Our goal is to popularize these methods by making it easier to make good active space choices. We present a machine learning protocol that performs an automated selection of active spaces for chemical bond dissociation calculations of main group diatomic molecules. The protocol shows high prediction performance for a given target system as long as a properly correlated system is chosen for training. Good active spaces are correctly predicted with a considerably better success rate than a random guess (precision above 80% for most systems studied). Our automated machine learning protocol shows that a "black-box" mode is possible for facilitating and accelerating large-scale calculations on multireference systems, where single-reference methods such as KS-DFT cannot be applied.