School of Pharmaceutical Sciences & School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China.
School of Computer Science & Technology, Wuyi University, 99 Yingbin Road, Jiangmen 529020, China.
J Chem Inf Model. 2019 Mar 25;59(3):1044-1049. doi: 10.1021/acs.jcim.8b00672. Epub 2019 Feb 21.
In the drug discovery process, compounds that are unstable in storage can lead to false positive or false negative bioassay conclusions. Predicting the chemical stability of a compound by de novo methods is complex, so chemical instability prediction is commonly based on models derived from empirical data. The COMDECOM (COMpound DECOMposition) project provides such empirical data. Models based on extended-connectivity fingerprints and atom center fragments were built from the COMDECOM data and used to estimate chemical stability, but deficits in these models remain. In this paper, we report DeepChemStable, a model that employs an attention-based graph convolution network built on the COMDECOM data. The main advantage of this method is that DeepChemStable is an end-to-end model: it does not predefine structural fingerprint features but instead dynamically learns structural features and associates them through the learning process of the attention-based graph convolution network. The previous ChemStable program relied on a rule-based method to reduce false negatives. DeepChemStable, in contrast, reduces the risk of false negatives without using a rule-based method. Because minimizing the rate of false negatives is the greater concern in instability prediction, this feature is a major improvement. The model achieves an AUC of 84.7%, a recall of 79.8%, and a 10-fold stratified cross-validation accuracy of 79.1%.
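To make the reported metrics concrete, the following is a minimal sketch of how recall, AUC, and a stratified fold split can be computed for a binary stability classifier. It is not the DeepChemStable code. It assumes that unstable compounds are the positive class (label 1), so a false negative is an unstable compound predicted to be stable. The function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly unstable compounds (label 1) that are flagged as unstable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive examples")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_folds(y_true: Sequence[int], k: int) -> List[List[int]]:
    """Deal the indices of each class round-robin into k folds (no shuffling)."""
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


# Toy data, made up for illustration only (not COMDECOM data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]                   # 1 = unstable (assumed positive class)
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]   # hypothetical predicted instability scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed decision threshold of 0.5

print("recall:", recall(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
print("folds:", stratified_folds(y_true, 2))
```

In this toy set one unstable compound scores below the threshold, which is the kind of false negative the abstract identifies as the greater concern. The round-robin split keeps the class ratio the same in every fold, which is what "stratified" means in the reported cross-validation; practical implementations usually shuffle within each class first.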