Zhang Hui, Chen Qing-Yi, Xiang Ming-Li, Ma Chang-Ying, Huang Qi, Yang Sheng-Yong
State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, No. 1, Keyuan Road 4, Gaopeng Street, Chengdu, Sichuan 610041, PR China.
Toxicol In Vitro. 2009 Feb;23(1):134-40. doi: 10.1016/j.tiv.2008.09.017. Epub 2008 Oct 2.
Drug-induced mitochondrial toxicity has become one of the key reasons why some drugs fail to reach the market or are withdrawn from it. Early identification of new chemical entities that impair mitochondrial function is therefore necessary to produce safer drugs and to reduce the attrition rate in the later stages of drug development. In this study, a support vector machine (SVM) method, combined with a genetic algorithm (GA) for feature selection and a conjugate gradient (CG) method for parameter optimization (GA-CG-SVM), was employed to develop a prediction model of mitochondrial toxicity. We first collected 288 compounds from different literature sources, comprising 171 mitochondrial-toxicity-positive (MT+) and 117 mitochondrial-toxicity-negative (MT-) compounds. These compounds were then randomly divided into a training set (253 compounds) and a test set (35 compounds). The overall prediction accuracy for the training set, obtained by 5-fold cross-validation, was 84.59%. The SVM model was further evaluated on the independent test set, for which the overall prediction accuracy was 77.14%. These results clearly indicate that mitochondrial toxicity is predictable. The impacts of feature selection and SVM parameter optimization on the quality of the SVM model were also examined and discussed. The results indicate the potential of the proposed GA-CG-SVM approach to facilitate the prediction of mitochondrial toxicity.
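The evaluation protocol described in the abstract (a random 253/35 split followed by 5-fold cross-validation on the training set) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed random seed, and the use of plain index lists in place of real compound descriptors are all assumptions made for illustration, and no SVM, GA, or CG step is included. The last line only checks that the reported test accuracy of 77.14% is arithmetically consistent with 27 of 35 test compounds being classified correctly.

```python
import random

N_COMPOUNDS = 288          # 171 MT+ and 117 MT-, as reported in the abstract
N_TRAIN, N_TEST = 253, 35  # split sizes reported in the abstract
N_FOLDS = 5                # 5-fold cross-validation on the training set


def split_train_test(n_items, n_test, rng):
    """Randomly split indices 0..n_items-1 into (train, test) index lists."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(train_indices, k, rng):
    """Partition the training indices into k disjoint, nearly equal folds."""
    shuffled = list(train_indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def accuracy_percent(n_correct, n_total):
    """Overall accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)


# Hypothetical seed, chosen only so the sketch is reproducible.
rng = random.Random(0)
train, test = split_train_test(N_COMPOUNDS, N_TEST, rng)
folds = k_fold_indices(train, N_FOLDS, rng)

print(len(train), len(test))        # 253 35
print([len(f) for f in folds])      # [51, 51, 51, 50, 50]
print(accuracy_percent(27, 35))     # 77.14
```

In each cross-validation round, one fold would serve as the validation subset while a model is fitted on the other four; the held-out 35 compounds are used only once, for the final evaluation.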