Institute of Biomedical Chemistry of Russian Academy of Medical Sciences, Moscow, Russia.
SAR QSAR Environ Res. 2009 Oct;20(7-8):679-709. doi: 10.1080/10629360903438370.
In existing quantitative structure-activity relationship (QSAR) methods, each molecule is represented as a single point in a multidimensional space of molecular descriptors. We propose a new QSAR approach based on Quantitative Neighbourhoods of Atoms (QNA) descriptors, which characterize each atom of a molecule and depend on the structure of the whole molecule. In the 'Star Track' methodology, each molecule is represented as a set of points in a two-dimensional space of QNA descriptors. With our new method, the estimate of the target property of a chemical compound is calculated as the average value of a function of the QNA descriptors, taken over the points that correspond to the atoms of the molecule in QNA descriptor space. In essence, we propose using only two descriptors rather than the more than 3000 molecular descriptors used in conventional QSAR methods. On the basis of this approach we have developed the computer program GUSAR and compared it with several widely used QSAR methods, including CoMFA, CoMSIA, Golpe/GRID, HQSAR and others, using ten data sets representing various chemical series and diverse types of biological activity. We show that in the majority of cases the accuracy and predictivity of GUSAR models appear to be better than those of the reference QSAR methods. The high predictive ability and robustness of GUSAR are also shown in a leave-20%-out cross-validation procedure.
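The averaging step described above can be illustrated with a minimal sketch. It is not the GUSAR implementation: the abstract does not define how the QNA descriptors are computed or what form the function takes, so the per-atom descriptor pairs, the name `estimate_property`, and the linear `toy_function` below are all hypothetical placeholders chosen only to show the arithmetic of averaging a function over the atom points of one molecule.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical: one point in the two-dimensional QNA descriptor space per atom.
AtomPoint = Tuple[float, float]


def estimate_property(
    atom_points: Sequence[AtomPoint],
    f: Callable[[float, float], float],
) -> float:
    """Average f over the atom points of one molecule.

    The property estimate is the mean of f evaluated at each atom's
    pair of descriptor values, as the abstract describes in words.
    """
    if not atom_points:
        raise ValueError("a molecule must contain at least one atom")
    values = [f(p, q) for p, q in atom_points]
    return sum(values) / len(values)


def toy_function(p: float, q: float) -> float:
    # Assumption for illustration only; the real function is fitted to data.
    return 1.0 + 2.0 * p - 0.5 * q


# Made-up descriptor values for a three-atom molecule.
molecule = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]
estimate = estimate_property(molecule, toy_function)
print(round(estimate, 6))
```

Because the estimate is a mean over atoms, molecules of different sizes map to values on the same scale, which is one reading of why a two-descriptor representation can be applied across diverse chemical series.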