Dandu Naveen K, Ward Logan, Assary Rajeev S, Redfern Paul C, Curtiss Larry A
Materials Science Division, Argonne National Laboratory, Lemont, Illinois 60439, United States.
Joint Center for Energy Storage Research (JCESR), Argonne National Laboratory, Lemont, Illinois 60439, United States.
J Phys Chem A. 2023 Jul 20;127(28):5914-5920. doi: 10.1021/acs.jpca.3c00823. Epub 2023 Jul 5.
In previous work (Dandu et al., ..., 4528-4536), we predicted accurate atomization energies of organic molecules using machine learning (ML) models, with errors as low as 0.1 kcal/mol relative to the G4MP2 method. In this work, we extend these ML models to adiabatic ionization potentials (IPs), using data sets of energies generated by quantum chemical calculations. Atom-specific corrections that were previously found to improve atomization energies from quantum chemical calculations are also used here to improve IPs. The quantum chemical calculations were performed on 3405 molecules containing eight or fewer non-hydrogen atoms, derived from the QM9 data set, with geometries optimized using the B3LYP functional and the 6-31G(2df,p) basis set. Low-fidelity IPs for these structures were obtained using two density functional methods: B3LYP/6-31+G(2df,p) and ωB97XD/6-311+G(3df,2p). Highly accurate G4MP2 calculations were performed on the optimized structures to obtain high-fidelity IPs for use in ML models based on the low-fidelity IPs. Our best-performing ML methods gave IPs of organic molecules with a mean absolute deviation of 0.035 eV from the G4MP2 IPs over the whole data set. This work demonstrates that ML predictions assisted by quantum chemical calculations can successfully predict IPs of organic molecules for use in high-throughput screening.
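The quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes total energies are given in Hartree and omits zero-point and thermal corrections. The function names are hypothetical. The difference-based target in `delta_targets` is one common way to combine low- and high-fidelity values; it is an assumption here, since the abstract does not state how the ML models use the low-fidelity IPs.

```python
# Illustrative sketch only; function names are hypothetical and the
# example does not reproduce the authors' workflow.

HARTREE_TO_EV = 27.211386245988  # CODATA 2018 Hartree-to-eV conversion


def adiabatic_ip(e_neutral_hartree, e_cation_hartree):
    """Adiabatic IP in eV: cation energy minus neutral energy,
    each taken at its own optimized geometry."""
    return (e_cation_hartree - e_neutral_hartree) * HARTREE_TO_EV


def mean_absolute_deviation(predicted, reference):
    """Mean absolute deviation between two equal-length lists of IPs."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def delta_targets(low_fidelity_ips, high_fidelity_ips):
    """Assumed learning target: high-fidelity IP minus low-fidelity IP.
    A model trained on these differences would add its prediction back
    to the low-fidelity IP to estimate the high-fidelity value."""
    return [h - l for l, h in zip(low_fidelity_ips, high_fidelity_ips)]
```

With a figure of merit like the 0.035 eV reported above, `mean_absolute_deviation` would be evaluated between the ML-predicted IPs and the G4MP2 reference IPs.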