Guha Rajarshi
School of Informatics, Indiana University, Bloomington, IN 47408, USA.
J Comput Aided Mol Des. 2008 Dec;22(12):857-71. doi: 10.1007/s10822-008-9240-5. Epub 2008 Sep 11.
The goal of a quantitative structure-activity relationship (QSAR) model is to encode the relationship between molecular structure and biological activity or physical property. Based on this encoding, such models can be used for predictive purposes. Assuming the use of relevant and meaningful descriptors, and a statistically significant model, extraction of the encoded structure-activity relationships (SARs) can provide insight into what makes a molecule active or inactive. Such analyses of QSAR models are useful in a number of scenarios, such as suggesting structural modifications to enhance activity, explaining outliers, and exploratory analysis of novel SARs. In this paper we discuss the need for interpretation and give an overview of the factors that affect the interpretability of QSAR models. We then describe interpretation protocols for different types of models, highlighting the different types of interpretations, which range from very broad, global trends to very specific, case-by-case descriptions of the SAR, using examples from the training set. Finally, we discuss a number of case studies in which workers have provided some form of interpretation of a QSAR model.