Wilton David, Willett Peter, Lawson Kevin, Mullier Graham
Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Sheffield S10 2TN, UK.
J Chem Inf Comput Sci. 2003 Mar-Apr;43(2):469-74. doi: 10.1021/ci025586i.
This paper discusses the use of several rank-based virtual screening methods for prioritizing compounds in lead-discovery programs, given a training set for which both structural and bioactivity data are available. Structures from the NCI AIDS data set and from the Syngenta corporate database were represented by two types of fragment bit-string and by sets of high-level molecular features. These representations were processed using binary kernel discrimination, similarity searching, substructural analysis, support vector machine, and trend vector analysis, with the effectiveness of the methods being judged by the extent to which active test set molecules were clustered toward the top of the resultant rankings. The binary kernel discrimination approach yielded consistently superior rankings and would appear to have considerable potential for chemical screening applications.
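The abstract gives no formulas, so the following is only a minimal sketch of how a rank-based screen of this kind could be set up. It assumes the kernel form commonly associated with binary kernel discrimination on bit-strings, K(i, j) = (lam^(n-d) * (1-lam)^d)^(k/n), where n is the bit-string length and d the Hamming distance, and it scores each test molecule by the ratio of its summed kernel values over training actives to those over training inactives. The function names, the parameters `lam` and `k`, their default values, and the toy bit-strings are all hypothetical and are not taken from the paper. The last function measures what the abstract describes as actives being clustered toward the top of the ranking.

```python
import math


def hamming(a, b):
    """Count the positions at which two equal-length bit-strings differ."""
    return sum(x != y for x, y in zip(a, b))


def bkd_kernel(a, b, lam=0.75, k=1):
    """Assumed kernel: (lam^(n-d) * (1-lam)^d)^(k/n), with 0.5 < lam < 1.

    Evaluated in log space so long bit-strings do not underflow.
    """
    n = len(a)
    d = hamming(a, b)
    log_value = (n - d) * math.log(lam) + d * math.log(1.0 - lam)
    return math.exp((k / n) * log_value)


def bkd_score(query, actives, inactives, lam=0.75, k=1):
    """Ratio of summed kernel values over training actives vs. inactives."""
    numerator = sum(bkd_kernel(query, a, lam, k) for a in actives)
    denominator = sum(bkd_kernel(query, i, lam, k) for i in inactives)
    return numerator / denominator


def rank_test_set(test_set, actives, inactives, lam=0.75, k=1):
    """Sort (bit-string, label) pairs by decreasing score."""
    return sorted(
        test_set,
        key=lambda item: bkd_score(item[0], actives, inactives, lam, k),
        reverse=True,
    )


def fraction_of_actives_in_top(ranked_labels, top_n):
    """Fraction of all active molecules found in the first top_n positions."""
    total_actives = sum(ranked_labels)
    if total_actives == 0:
        return 0.0
    return sum(ranked_labels[:top_n]) / total_actives


# Toy data (hypothetical): 4-bit fingerprints, label 1 = active, 0 = inactive.
train_actives = [(1, 1, 0, 0), (1, 1, 1, 0)]
train_inactives = [(0, 0, 1, 1), (0, 0, 0, 1)]
test_set = [
    ((0, 0, 1, 1), 0),
    ((1, 1, 0, 0), 1),
    ((1, 0, 0, 0), 1),
    ((0, 1, 1, 1), 0),
]

ranked = rank_test_set(test_set, train_actives, train_inactives)
ranked_labels = [label for _, label in ranked]
print(ranked_labels)
print(fraction_of_actives_in_top(ranked_labels, top_n=2))
```

In practice `lam` and `k` would be tuned on the training set, and the same ranking and top-of-list measure could be applied to any of the other scoring methods named in the abstract for comparison.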