J Chem Theory Comput. 2018 Nov 13;14(11):5764-5776. doi: 10.1021/acs.jctc.8b00873. Epub 2018 Nov 5.
Current neural networks for predictions of molecular properties use quantum chemistry only as a source of training data. This paper explores models that use quantum chemistry as an integral part of the prediction process. This is done by implementing self-consistent-charge Density-Functional-Tight-Binding (DFTB) theory as a layer for use in deep learning models. The DFTB layer takes, as input, Hamiltonian matrix elements generated from earlier layers and produces, as output, electronic properties from self-consistent field solutions of the corresponding DFTB Hamiltonian. Backpropagation enables efficient training of the model to target electronic properties. Two types of input to the DFTB layer are explored: splines and feed-forward neural networks. Because overfitting can cause models trained on smaller molecules to perform poorly on larger molecules, regularizations are applied that penalize nonmonotonic behavior and deviation of the Hamiltonian matrix elements from those of the published DFTB model used to initialize the model. The approach is evaluated on 15 700 hydrocarbons by comparing the root-mean-square error in energy and dipole moment, on test molecules with eight heavy atoms, to the error from the initial DFTB model. When trained on molecules with up to seven heavy atoms, the spline model reduces the test error in energy by 60% and in dipole moments by 42%. The neural network model performs somewhat better, with error reductions of 67% and 59%, respectively. Training on molecules with up to four heavy atoms reduces performance, with both the spline and neural net models reducing the test error in energy by about 53% and in dipole moment by about 25%.
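The core idea of the abstract, a layer whose forward pass is an eigensolve of a learnable Hamiltonian and whose backward pass supplies gradients for training, can be sketched in a toy form. This is not the paper's DFTB implementation: it uses a small symmetric matrix in place of a DFTB Hamiltonian, a sum of occupied eigenvalues in place of a self-consistent-field energy, and a Hellmann-Feynman gradient in place of full backpropagation. All sizes, names, and the target value are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_occ = 4, 2   # orbital basis size and occupied-orbital count (illustrative)

# Fixed "reference" Hamiltonian, playing the role of the published DFTB
# parameterization used to initialize the model.
H0 = rng.standard_normal((n, n))
H0 = 0.5 * (H0 + H0.T)

# Learnable symmetric correction: theta holds the upper-triangular elements,
# standing in for the spline / neural-network outputs of the real model.
iu = np.triu_indices(n)
theta = np.zeros(len(iu[0]))

def hamiltonian(theta):
    dH = np.zeros((n, n))
    dH[iu] = theta
    dH = dH + dH.T - np.diag(np.diag(dH))  # symmetric fill from upper triangle
    return H0 + dH

def energy_and_grad(theta):
    """Forward: eigensolve; backward: Hellmann-Feynman gradient dE/dtheta."""
    eps, C = np.linalg.eigh(hamiltonian(theta))
    E = eps[:n_occ].sum()                  # sum of occupied eigenvalues
    P = C[:, :n_occ] @ C[:, :n_occ].T      # density-like matrix; dE/dH_ab = P_ab
    G = 2.0 * P - np.diag(np.diag(P))      # chain rule through the symmetric fill
    return E, G[iu]

# "Training": gradient descent on (E - E_target)^2, mimicking fitting the
# layer's input matrix elements to a reference electronic property.
E_target = -3.0
for _ in range(200):
    E, g = energy_and_grad(theta)
    theta -= 0.1 * 2.0 * (E - E_target) * g  # d/dtheta of (E - E_target)^2
E, _ = energy_and_grad(theta)
print(abs(E - E_target) < 1e-3)
```

In the paper's setting the gradient flows further back, through the matrix elements into the spline or neural-network parameters; here the matrix elements themselves are the parameters, which keeps the example short.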