Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
J Chem Inf Model. 2010 Nov 22;50(11):1941-8. doi: 10.1021/ci100305g. Epub 2010 Nov 4.
Constructing a highly predictive model and elucidating the mechanism underlying a specific property of chemicals are the two main goals of quantitative structure-activity relationship (QSAR) analysis. However, the latter has long been treated as a byproduct of model construction. In this study we show, for the first time, that conventional descriptor selection methods designed to develop the best predictive model are likely unsuitable for mechanistic analysis, because the descriptors they select depend strongly on which chemicals are included in the training set. As an alternative, we propose a consensus ranking protocol that selects a robust descriptor set for mechanistic analysis and overcomes this shortcoming. Moreover, models built with the descriptors selected for mechanistic analysis performed consistently worse, which suggests that dedicated model development remains irreplaceable for achieving the best predictive capability.
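The abstract does not describe how the consensus ranking protocol works, so the following is only a minimal sketch of one plausible reading of the idea: rank the descriptors on many random training subsets and keep those that rank well on average, so that the result no longer hinges on a single choice of training chemicals. Everything here is an assumption made for illustration and not the authors' method: the function name `consensus_rank`, its parameters, the use of absolute Pearson correlation as the per-subset score, and the use of the mean rank as the aggregate.

```python
import numpy as np

def consensus_rank(X, y, n_rounds=50, train_frac=0.8, seed=0):
    """Hypothetical consensus ranking of descriptors (columns of X).

    Assumption: each round scores descriptors by |Pearson r| with y on a
    random training subset; ranks are then averaged over all rounds.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_desc = X.shape
    n_train = int(train_frac * n_samples)
    rank_sum = np.zeros(n_desc)

    for _ in range(n_rounds):
        # Draw a random training subset without replacement.
        idx = rng.choice(n_samples, size=n_train, replace=False)
        Xc = X[idx] - X[idx].mean(axis=0)
        yc = y[idx] - y[idx].mean()

        # Absolute Pearson correlation; constant columns score 0.
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
        score = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)

        # Convert scores to ranks (1 = most strongly correlated).
        order = np.argsort(-score)
        ranks = np.empty(n_desc)
        ranks[order] = np.arange(1, n_desc + 1)
        rank_sum += ranks

    mean_rank = rank_sum / n_rounds
    # Descriptor indices sorted from best to worst consensus rank.
    return np.argsort(mean_rank), mean_rank
```

Called as `order, mean_rank = consensus_rank(X, y)`, with `X` a samples-by-descriptors array and `y` the measured property, the leading entries of `order` are the descriptors that rank highly regardless of which chemicals fall in the training subset. A descriptor that ranks first on only one particular split receives a poor mean rank, which is the kind of instability the abstract attributes to conventional selection.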