Ramakrishnan Raghunathan, Hartmann Mia, Tapavicza Enrico, von Lilienfeld O Anatole
Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials, Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland.
Department of Chemistry and Biochemistry, California State University, 1250 Bellflower Boulevard, Long Beach, California 90840, USA.
J Chem Phys. 2015 Aug 28;143(8):084111. doi: 10.1063/1.4928757.
Due to its favorable computational efficiency, time-dependent (TD) density functional theory (DFT) enables the prediction of electronic spectra in a high-throughput manner across chemical space. Its predictions, however, can be quite inaccurate. We resolve this issue with machine learning models trained on deviations of reference second-order approximate coupled-cluster singles and doubles (CC2) spectra from their TDDFT counterparts, or even from the DFT gap. We applied this approach to low-lying singlet-singlet vertical electronic spectra of over 20 000 synthetically feasible small organic molecules with up to eight CONF (C, O, N, F) atoms. The prediction errors decay monotonically as a function of training set size. For a training set of 10 000 molecules, CC2 excitation energies can be reproduced to within ±0.1 eV for the remaining molecules. Analysis of our spectral database via chromophore counting suggests that even higher accuracies can be achieved. Based on the evidence collected, we discuss open challenges associated with data-driven modeling of high-lying spectra and transition intensities.
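The idea of training on deviations between a reference method and a cheaper baseline can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the data are synthetic stand-ins for molecular descriptors and excitation energies, the regressor is a kernel ridge model with a Laplacian kernel, and the names `laplacian_kernel`, `fit_delta_model`, and `predict_with_delta` are hypothetical.

```python
import numpy as np

def laplacian_kernel(A, B, sigma):
    """K[i, j] = exp(-||A[i] - B[j]||_1 / sigma). Kernel choice is an assumption."""
    l1 = np.abs(A[:, None, :] - B[None, :, :]).sum(axis=2)
    return np.exp(-l1 / sigma)

def fit_delta_model(X_train, e_cheap, e_ref, sigma=5.0, lam=1e-8):
    """Fit kernel ridge regression to the deviation e_ref - e_cheap."""
    delta = e_ref - e_cheap
    K = laplacian_kernel(X_train, X_train, sigma)
    # Solve (K + lam * I) alpha = delta for the regression weights.
    return np.linalg.solve(K + lam * np.eye(len(X_train)), delta)

def predict_with_delta(X_new, e_cheap_new, X_train, alpha, sigma=5.0):
    """Cheap-method energy plus the learned correction."""
    return e_cheap_new + laplacian_kernel(X_new, X_train, sigma) @ alpha

# Synthetic stand-in data (NOT real TDDFT or CC2 values).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 3))      # hypothetical descriptors
e_ref = 5.0 + X[:, 0] ** 2 - 0.5 * X[:, 1]     # plays the role of CC2 (eV)
e_cheap = e_ref - (0.5 + 0.3 * np.sin(X[:, 0]))  # plays the role of TDDFT (eV)

# Train on the first 300 "molecules", test on the remaining 100.
n_train = 300
alpha = fit_delta_model(X[:n_train], e_cheap[:n_train], e_ref[:n_train])
e_pred = predict_with_delta(X[n_train:], e_cheap[n_train:], X[:n_train], alpha)

mae_baseline = np.mean(np.abs(e_ref[n_train:] - e_cheap[n_train:]))
mae_delta = np.mean(np.abs(e_ref[n_train:] - e_pred))
print(f"baseline MAE: {mae_baseline:.3f} eV, delta-corrected MAE: {mae_delta:.3f} eV")
```

Repeating the fit for increasing `n_train` and recording `mae_delta` would give a learning curve of the kind the abstract refers to, although the numbers from this toy setup say nothing about the accuracy reported for real molecules.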