Lee Myeonghun, Min Kyoungmin
School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea.
School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea.
ACS Omega. 2022 Jan 14;7(4):3649-3655. doi: 10.1021/acsomega.1c06274. eCollection 2022 Feb 1.
The prediction and evaluation of the biodegradability of molecules with computational methods are becoming increasingly important. Among the various methods, quantitative structure-activity relationship (QSAR) models have been demonstrated to predict the ready biodegradation of chemicals but have limited functionality owing to their complex implementation. In this study, we employ the graph convolutional network (GCN) method to overcome this limitation. Prediction models were trained on a biodegradability dataset from previous studies using (i) QSAR models built with the Mordred molecular descriptor calculator and the MACCS molecular fingerprint and (ii) a GCN model built on molecular graphs. The performance comparison confirms that the GCN model is more straightforward to implement and more stable; its specificity and sensitivity values are almost identical, without the need for specific descriptors or fingerprints. In addition, the performance of the models was further verified by randomly dividing the dataset into 100 different training and test sets and by varying the test set ratio from 20% to 80%. The results suggest the promise of the GCN model, which can be implemented straightforwardly and can replace conventional QSAR prediction models for various types and properties of molecules.
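The validation protocol described in the abstract (repeated random splits at several test set ratios, scored by sensitivity and specificity) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the toy dataset is made up, and `majority_baseline` is only a placeholder for where a QSAR or GCN model would be trained and applied.

```python
import random
from statistics import fmean


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def random_split(n, test_ratio, rng):
    """Shuffle indices 0..n-1 and return (train_indices, test_indices)."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_ratio)
    return idx[n_test:], idx[:n_test]


def evaluate_over_splits(X, y, fit_predict, test_ratio, n_splits=100, seed=0):
    """Score fit_predict on n_splits random train/test partitions.

    fit_predict(X_train, y_train, X_test) must return predicted labels for X_test.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        train, test = random_split(len(y), test_ratio, rng)
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        scores.append(sensitivity_specificity([y[i] for i in test], preds))
    return scores


def majority_baseline(X_train, y_train, X_test):
    """Placeholder model (assumption): predicts the majority training label."""
    label = 1 if sum(y_train) * 2 >= len(y_train) else 0
    return [label] * len(X_test)


if __name__ == "__main__":
    # Toy data (assumption): 100 "molecules" with one dummy feature each.
    X = [[i] for i in range(100)]
    y = [i % 2 for i in range(100)]
    for ratio in (0.2, 0.4, 0.6, 0.8):
        scores = evaluate_over_splits(X, y, majority_baseline, ratio)
        print(ratio, fmean(s for s, _ in scores), fmean(s for _, s in scores))
```

Comparing the spread of the two metrics across the 100 splits, and the gap between them, is one way to quantify the stability that the abstract attributes to the GCN model.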