Department of Pathology and Immunology, School of Medicine, Washington University in St. Louis, Saint Louis, Missouri 63110, United States.
J Phys Chem A. 2020 Nov 5;124(44):9194-9202. doi: 10.1021/acs.jpca.0c06231. Epub 2020 Oct 21.
Atom- or bond-level chemical properties of interest in medicinal chemistry, such as drug metabolism and electrophilic reactivity, are important to understand and predict for arbitrary new molecules. Deep learning can map molecular structures to these chemical properties, but the data sets available for these tasks are relatively small, which can limit accuracy and generalizability. One way around this limitation is to model these properties on the basis of the underlying quantum chemical characteristics of small molecules. However, it is difficult to learn higher-level chemical properties directly from lower-level quantum calculations. To address this challenge, we pretrained deep learning models to compute quantum chemical properties and then reused the intermediate representations constructed by the pretrained network. Transfer learning of this kind substantially outperformed models based on chemical graphs alone or on quantum chemical properties alone. The result was robust, holding across five prediction tasks: identifying sites of epoxidation by metabolic enzymes and identifying sites of covalent reactivity with cyanide, glutathione, DNA, and protein. These results suggest that the approach may substantially improve the accuracy of deep learning models for specific chemical structures, such as aromatic systems.
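The pretrain-then-reuse idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, `aux_property` is a hypothetical stand-in for a computed quantum chemical quantity, and the tiny one-hidden-layer network is not the architecture used in the paper. It shows only the mechanics: fit a network to an auxiliary target on a large set, freeze its hidden layer, and train a small classifier on those frozen features using a small labeled set.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hidden(W1, x):
    # Intermediate representation that is reused downstream: tanh(W1 x).
    return [math.tanh(dot(row, x)) for row in W1]

def pretrain(X, Y, n_hidden, epochs, lr, rng):
    """Fit a one-hidden-layer regressor to the auxiliary target with plain SGD."""
    W1 = [[rng.uniform(-0.5, 0.5) for _ in X[0]] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, y in zip(X, Y):
            h = hidden(W1, x)
            err = dot(w2, h) - y  # gradient of 0.5 * err**2 w.r.t. the output
            for j in range(n_hidden):
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                w2[j] -= lr * err * h[j]
                for i in range(len(x)):
                    W1[j][i] -= lr * grad_pre * x[i]
    return W1, w2

def fit_head(W1, X, labels, epochs, lr):
    """Train a logistic-regression head on frozen features; W1 is never updated."""
    feats = [hidden(W1, x) for x in X]
    v, b = [0.0] * len(W1), 0.0
    for _ in range(epochs):
        for h, t in zip(feats, labels):
            g = sigmoid(dot(v, h) + b) - t  # gradient of the log loss w.r.t. the logit
            v = [vj - lr * g * hj for vj, hj in zip(v, h)]
            b -= lr * g
    return v, b

def predict(W1, v, b, x):
    return sigmoid(dot(v, hidden(W1, x)) + b)

rng = random.Random(0)

def make_atoms(n):
    # Synthetic 3-feature "atom descriptors"; not real chemistry.
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]

def aux_property(x):
    # Hypothetical smooth quantity standing in for a quantum calculation.
    return math.sin(x[0]) + x[1] * x[2]

# Large auxiliary set for pretraining, small labeled set for the target task.
X_big = make_atoms(200)
Y_big = [aux_property(x) for x in X_big]
X_small = make_atoms(20)
labels = [1 if aux_property(x) > 0 else 0 for x in X_small]

W1, _ = pretrain(X_big, Y_big, n_hidden=8, epochs=50, lr=0.05, rng=rng)
W1_before = [row[:] for row in W1]  # snapshot to confirm the layer stays frozen
v, b = fit_head(W1, X_small, labels, epochs=200, lr=0.1)
probs = [predict(W1, v, b, x) for x in X_small]
```

Freezing `W1` is the simplest form of reuse; fine-tuning the pretrained layer together with the head is a common alternative, and the abstract does not say which variant the authors used.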