Division of Pharmaceutical Technology, Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland.
J Chem Inf Model. 2011 Oct 24;51(10):2690-6. doi: 10.1021/ci200186m. Epub 2011 Sep 14.
Chemical fingerprints encode the presence or absence of molecular features and are available in many large databases. Using a variation of the Ant Colony Optimization (ACO) paradigm, we describe a binary classifier based on feature selection from fingerprints. We discuss the algorithm and possible cross-validation procedures. As a real-world example, we use our algorithm to analyze a Plasmodium falciparum inhibition assay and contrast its performance with other machine learning paradigms in use today (decision tree induction, random forests, support vector machines, artificial neural networks). Our algorithm matches established paradigms in predictive power, yet supplies the medicinal chemist and basic researcher with easily interpretable results. Furthermore, models generated with our paradigm are easy to implement and can complement virtual screenings by additionally exploiting the precalculated fingerprint information.
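The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of ACO-style feature selection over binary fingerprints, not the authors' method. Everything in it is an assumption made for illustration: the names `predict`, `accuracy`, and `aco_select`, the "active if any selected bit is set" decision rule, the pheromone update, the parameter values, and the toy data (all 6-bit fingerprints, with activity defined as bit 1 or bit 4 being set).

```python
import itertools
import random


def predict(fp, selected):
    """Assumed rule: call a compound active if any selected bit is set."""
    return any(fp[i] for i in selected)


def accuracy(fps, labels, selected):
    """Fraction of compounds whose predicted class matches the label."""
    hits = sum(predict(fp, selected) == y for fp, y in zip(fps, labels))
    return hits / len(fps)


def aco_select(fps, labels, n_select=2, n_ants=20, n_iter=30, rho=0.1, seed=0):
    """Hypothetical ACO-style search for a small set of predictive bits."""
    rng = random.Random(seed)
    n_bits = len(fps[0])
    pheromone = [1.0] * n_bits  # one pheromone value per fingerprint bit
    best, best_score = None, -1.0
    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant draws n_select distinct bits, weighted by pheromone.
            candidates = list(range(n_bits))
            weights = pheromone[:]
            chosen = []
            for _ in range(n_select):
                k = rng.choices(range(len(candidates)), weights=weights)[0]
                chosen.append(candidates.pop(k))
                weights.pop(k)
            score = accuracy(fps, labels, chosen)
            if score > best_score:
                best, best_score = sorted(chosen), score
        # Evaporate all trails, then reinforce the bits of the best set so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for i in best:
            pheromone[i] += best_score
    return best, best_score


# Toy data (assumption): every 6-bit fingerprint, active if bit 1 or bit 4 is set.
FPS = list(itertools.product([0, 1], repeat=6))
LABELS = [bool(fp[1] or fp[4]) for fp in FPS]

best_bits, best_acc = aco_select(FPS, LABELS)
print(best_bits, best_acc)
```

The resulting model is just a short list of fingerprint bit positions, which illustrates why such a classifier is easy to interpret and to apply to precalculated fingerprints. A real study would score candidate bit sets under cross-validation rather than on the training data itself.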