The David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
J Cheminform. 2009 Apr 28;1:4. doi: 10.1186/1758-2946-1-4.
The inverse-QSAR problem seeks to find a new molecular descriptor from which one can recover the structure of a molecule that possesses a desired activity or property. Surprisingly, very few papers provide solutions to this problem. The problem is difficult because the molecular descriptors used by an inverse-QSAR algorithm must adequately address the forward QSAR problem for a given biological activity if the subsequent recovery phase is to be meaningful. In addition, one should be able to construct a feasible molecule from such a descriptor. The difficulty of recovering the molecule from its descriptor is the major limitation of most inverse-QSAR methods.
In this paper, we describe the reversibility of our previously reported vector space model molecular descriptor (VSMMD), which is suitable for kernel studies in QSAR modeling. Our inverse-QSAR approach consists of five steps: (1) generate the VSMMD for the compounds in the training set; (2) map the VSMMD from the input space to the kernel feature space using an appropriate kernel function; (3) design or generate a new point in the kernel feature space using a kernel feature space algorithm; (4) map the feature space point back to the input space of descriptors using a pre-image approximation algorithm; (5) build the molecular structure template using our VSMMD molecule recovery algorithm.
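As a rough illustration of steps (2) to (4), the sketch below maps descriptor vectors through a Gaussian kernel, defines a new feature-space point as a weighted combination of training points, and approximates its pre-image with a standard fixed-point iteration for Gaussian kernels. Everything here is an assumption made for illustration: the abstract does not state which kernel, feature-space algorithm, or pre-image method is used, the random integer vectors merely stand in for real VSMMD vectors, and the names `rbf_kernel`, `preimage_fixed_point`, `coeffs`, and `gamma` are hypothetical. Steps (1) and (5) are not covered.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel values between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off

def preimage_fixed_point(X, coeffs, gamma, n_iter=100, tol=1e-9):
    """Approximate the input-space pre-image z of sum_i coeffs[i] * phi(X[i]).

    Assumes nonnegative coefficients with a positive sum, so every iterate
    stays a convex combination of the training descriptors.
    """
    z = coeffs @ X / coeffs.sum()  # start from the weighted mean in input space
    for _ in range(n_iter):
        w = coeffs * rbf_kernel(X, z[None, :], gamma)[:, 0]
        denom = w.sum()
        if abs(denom) < 1e-12:  # all kernel weights vanished; keep current z
            break
        z_new = w @ X / denom
        converged = np.linalg.norm(z_new - z) < tol
        z = z_new
        if converged:
            break
    return z

# Stand-in for step (1): random count-like vectors, NOT real VSMMD descriptors.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(6, 4)).astype(float)

# Step (2): the kernel matrix implicitly represents the feature-space images.
K = rbf_kernel(X, X, gamma=0.1)

# Step (3): a new feature-space point, here the midpoint of the first two images.
coeffs = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])

# Step (4): approximate pre-image in descriptor space.
z = preimage_fixed_point(X, coeffs, gamma=0.1)
print(z)
```

The result `z` is generally real-valued, so a recovery step such as (5) would still need to turn it into a valid descriptor before a structure template can be built; how that is done is specific to the VSMMD and is not shown here.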
The empirical results reported in this paper show that our strategy of using kernel methodology for the inverse-QSAR problem is sufficiently powerful to find meaningful solutions to practical problems.