McNaughton Andrew D, Joshi Rajendra P, Knutson Carter R, Fnu Anubhav, Luebke Kevin J, Malerich Jeremiah P, Madrid Peter B, Kumar Neeraj
Pacific Northwest National Laboratory, Richland, Washington 99354, United States.
SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States.
J Chem Inf Model. 2023 Mar 13;63(5):1462-1471. doi: 10.1021/acs.jcim.2c01662. Epub 2023 Feb 27.
Accurate understanding of ultraviolet-visible (UV-vis) spectra is critical for the high-throughput synthesis of compounds for drug discovery. Determining UV-vis spectra experimentally can become expensive when dealing with a large number of novel compounds. This creates an opportunity to advance computational prediction of molecular properties using quantum mechanics and machine learning methods. In this work, we use both quantum mechanically (QM) predicted and experimentally measured UV-vis spectra as input to devise four different machine learning architectures, UVvis-SchNet, UVvis-DTNN, UVvis-Transformer, and UVvis-MPNN, and assess the performance of each method. We find that the UVvis-MPNN model outperforms the other models when using optimized 3D coordinates and QM predicted spectra as input features. This model has the highest performance for predicting UV-vis spectra, with a training RMSE of 0.06 and a validation RMSE of 0.08. Most importantly, our model can be used for the challenging task of predicting differences in the UV-vis spectral signatures of regioisomers.
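The RMSE values quoted in the abstract compare predicted spectra against reference spectra. As a minimal sketch of how such a metric can be computed, the Python below assumes both spectra are sampled on the same wavelength grid and hold normalized intensities. The abstract does not state the grid, the normalization, or how errors are aggregated across molecules, so the function name `spectrum_rmse` and the sample values are hypothetical and purely illustrative.

```python
import math


def spectrum_rmse(predicted, reference):
    """Root-mean-square error between two spectra on a shared wavelength grid.

    Assumption: both inputs are equal-length sequences of intensities
    sampled at the same wavelengths (not specified in the abstract).
    """
    if len(predicted) != len(reference):
        raise ValueError("spectra must be sampled on the same grid")
    if len(predicted) == 0:
        raise ValueError("spectra must contain at least one point")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical normalized intensities at five shared wavelengths
# (illustrative values only, not data from the paper).
reference = [0.10, 0.40, 1.00, 0.55, 0.20]
predicted = [0.12, 0.35, 0.95, 0.60, 0.18]

error = spectrum_rmse(predicted, reference)
print(f"RMSE = {error:.3f}")
```

A lower value means the predicted curve tracks the reference more closely point by point. Because RMSE depends on the intensity scale, values are only comparable between spectra normalized the same way.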