Lin H H, Han L Y, Yap C W, Xue Y, Liu X H, Zhu F, Chen Y Z
Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543, Singapore.
J Mol Graph Model. 2007 Sep;26(2):505-18. doi: 10.1016/j.jmgm.2007.03.003. Epub 2007 Mar 12.
Factor Xa (FXa) inhibitors have been explored as anticoagulants for the treatment and prevention of thrombotic diseases. Molecular docking, pharmacophore, quantitative structure-activity relationship, and support vector machine (SVM) methods have been used for computational prediction of FXa inhibitors. These methods achieve promising prediction accuracies of 69-80% for FXa inhibitors and 85-99% for non-inhibitors. Prediction performance, particularly for inhibitors, may be further improved by exploring methods applicable to a more diverse range of compounds and by using a more appropriate set of molecular descriptors. We tested the capability of several machine learning methods (C4.5 decision tree, k-nearest neighbor, probabilistic neural network, and support vector machine) by using a set of 1098 compounds (360 inhibitors and 738 non-inhibitors) that is much more diverse than those in other studies. A feature selection method was used to select molecular descriptors appropriate for distinguishing FXa inhibitors from non-inhibitors. The prediction accuracies of these methods are 89.1-97.5% for FXa inhibitors and 92.3-98.1% for non-inhibitors. In particular, compared to other studies, the support vector machine gives a substantially improved accuracy of 94.6% for FXa inhibitors and maintains a comparable accuracy of 98.1% for non-inhibitors, based on a more rigorous test with a more diverse range of compounds. Our study suggests that machine learning methods such as SVM are useful for facilitating the prediction of FXa inhibitors.
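The abstract reports accuracy separately for inhibitors and non-inhibitors. A minimal sketch of how such per-class accuracies might be computed follows, assuming they correspond to sensitivity (the fraction of true inhibitors predicted as inhibitors) and specificity (the fraction of true non-inhibitors predicted as non-inhibitors). The function name `per_class_accuracy` and the toy labels are hypothetical and are not taken from the study.

```python
def per_class_accuracy(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumption: `positive` marks inhibitors; any other label is
    treated as a non-inhibitor.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # inhibitor correctly predicted
            else:
                fn += 1  # inhibitor missed
        else:
            if pred == positive:
                fp += 1  # non-inhibitor wrongly flagged
            else:
                tn += 1  # non-inhibitor correctly predicted
    # Guard against an empty class to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels (1 = inhibitor, 0 = non-inhibitor), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

se, sp = per_class_accuracy(y_true, y_pred)
print(f"inhibitor accuracy: {se:.1%}, non-inhibitor accuracy: {sp:.1%}")
```

Reporting the two figures separately matters for an imbalanced set such as 360 inhibitors against 738 non-inhibitors, where a single overall accuracy can hide weak performance on the smaller class.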