Nycomed Chair for Bioinformatics and Information Mining, University of Konstanz, Konstanz, Germany.
J Chem Inf Model. 2011 Feb 28;51(2):237-47. doi: 10.1021/ci100426r. Epub 2011 Feb 10.
Diversity selection is a common task in early drug discovery. One drawback of current approaches is that they usually take only structural diversity into account and therefore ignore activity information. In this article, we present a modified version of diversity selection, which we term Maximum-Score Diversity Selection, that additionally takes the estimated or predicted activities of the molecules into account. We show that this problem is NP-hard, so finding an optimal solution is computationally very expensive and heuristic approaches are needed. After a discussion of existing approaches, we present our new method, which is computationally far more efficient while producing comparable results. We conclude by validating these theoretical differences on several data sets.
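To make the problem statement concrete, the following is a minimal sketch of a generic greedy heuristic that trades predicted activity against structural diversity. It is not the method presented in the article; the function name `greedy_score_diversity`, the `weight` parameter, and the weighted-sum objective are illustrative assumptions, and the distance matrix is assumed to be precomputed (for example from fingerprint dissimilarities).

```python
from typing import List, Sequence


def greedy_score_diversity(
    scores: Sequence[float],
    dist: Sequence[Sequence[float]],
    k: int,
    weight: float = 0.5,
) -> List[int]:
    """Pick k items that balance a high score against mutual distance.

    Hypothetical illustration only: a plain greedy heuristic, not the
    algorithm proposed in the article.

    scores -- estimated or predicted activity per molecule (higher is better)
    dist   -- symmetric pairwise distance matrix (higher is more dissimilar)
    weight -- assumed trade-off: 1.0 ranks by score only, 0.0 by distance only
    """
    n = len(scores)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of items")

    # Seed the selection with the single highest-scoring item.
    selected = [max(range(n), key=lambda i: scores[i])]

    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]

        def gain(i: int) -> float:
            # Diversity contribution: distance to the nearest selected item.
            nearest = min(dist[i][j] for j in selected)
            return weight * scores[i] + (1.0 - weight) * nearest

        selected.append(max(remaining, key=gain))

    return selected
```

Each greedy step scans all remaining candidates against the current selection, so the sketch runs in polynomial time but gives no guarantee of reaching the optimum of the NP-hard problem.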