Guha Rajarshi, Van Drie John H
School of Informatics, Indiana University, Bloomington, IN 47406, USA.
J Chem Inf Model. 2008 Aug;48(8):1716-28. doi: 10.1021/ci8001414. Epub 2008 Aug 8.
We introduce structure-activity landscape index (SALI) curves as a way to assess a model and a modeling protocol applied to structure-activity relationships. We build on our earlier work [J. Chem. Inf. Model., 2008, 48, 646-658], which showed how to study a structure-activity relationship pairwise using the notion of "activity cliffs": pairs of molecules that are structurally similar but differ greatly in activity. That work also introduced the SALI parameter, which makes cliffs easy to identify and allows a structure-activity relationship to be represented as a graph. This graph orders every pair of molecules by their activity. Here, we introduce the SALI curve, which tallies how many of these orderings a model is able to predict. Empirically, when we test these SALI curves against a variety of models, spanning two-dimensional quantitative structure-activity relationship (2D-QSAR), three-dimensional quantitative structure-activity relationship (3D-QSAR), and structure-based design models, the utility of a model seems to correspond to characteristics of these curves. In particular, the integral of these curves, denoted SCI and ranging from -1.0 to 1.0, approaches 1.0 for two literature models, both of which are known to be prospectively useful.
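The tallying described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the usual pairwise form of the SALI parameter, |A_i - A_j| / (1 - sim(i, j)); that each curve point is (correct - incorrect) / total over the pairs whose SALI is at least a fraction X of the maximum SALI; and that SCI is the trapezoidal integral of that curve over X in [0, 1]. The function names `sali`, `sali_curve`, and `sci`, the threshold grid, and the treatment of tied predictions (counted as neither correct nor incorrect) are all choices made here for illustration.

```python
from itertools import combinations


def sali(act_i, act_j, sim_ij):
    """Assumed SALI form for one pair: |A_i - A_j| / (1 - sim_ij)."""
    return abs(act_i - act_j) / (1.0 - sim_ij)


def sali_curve(observed, predicted, similarity, n_thresholds=11):
    """Return (thresholds, scores) for a hypothetical SALI curve.

    observed, predicted: lists of activities, one per molecule.
    similarity: square matrix (list of lists) of pairwise similarities.
    Each score is (correct - incorrect) / total over the pairs kept
    at that threshold, so it lies between -1.0 and 1.0.
    """
    edges = []
    for i, j in combinations(range(len(observed)), 2):
        s = similarity[i][j]
        # Skip pairs with undefined SALI or with no ordering to predict.
        if s >= 1.0 or observed[i] == observed[j]:
            continue
        edges.append((sali(observed[i], observed[j], s), i, j))
    if not edges:
        raise ValueError("no ordered pairs to evaluate")

    max_sali = max(value for value, _, _ in edges)
    thresholds = [k / (n_thresholds - 1) for k in range(n_thresholds)]
    scores = []
    for x in thresholds:
        # Keep only the pairs at or above this fraction of the maximum SALI.
        kept = [(i, j) for value, i, j in edges if value >= x * max_sali]
        correct = incorrect = 0
        for i, j in kept:
            agreement = (observed[i] - observed[j]) * (predicted[i] - predicted[j])
            if agreement > 0:
                correct += 1
            elif agreement < 0:
                incorrect += 1
            # agreement == 0: tied prediction, counted as neither
        scores.append((correct - incorrect) / len(kept))
    return thresholds, scores


def sci(thresholds, scores):
    """Trapezoidal integral of the curve over the threshold axis."""
    return sum(
        (thresholds[k + 1] - thresholds[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(scores) - 1)
    )


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    observed = [5.0, 6.0, 8.0, 9.0]
    predicted = [5.2, 5.9, 7.5, 9.3]
    similarity = [
        [1.0, 0.9, 0.5, 0.3],
        [0.9, 1.0, 0.6, 0.4],
        [0.5, 0.6, 1.0, 0.8],
        [0.3, 0.4, 0.8, 1.0],
    ]
    xs, ys = sali_curve(observed, predicted, similarity)
    print(list(zip(xs, ys)))
    print("SCI:", sci(xs, ys))
```

Under these assumptions, a model that reproduces every observed ordering gives a curve fixed at 1.0 and an integral of 1.0, while a model that inverts every ordering gives -1.0, which matches the range the abstract states for SCI.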