Modeling and Informatics, Merck & Co., Inc., Kenilworth, New Jersey 07065, United States.
J Chem Inf Model. 2019 Apr 22;59(4):1324-1337. doi: 10.1021/acs.jcim.8b00825. Epub 2019 Mar 4.
Most chemists would agree that the ability to interpret a quantitative structure-activity relationship (QSAR) model is as important as the ability of the model to make accurate predictions. One type of interpretation is coloration of atoms in molecules according to the contribution of each atom to the predicted activity, as in "heat maps". The ability to determine which parts of a molecule increase the activity in question and which decrease it should be useful to chemists who want to modify the molecule. For that type of application, we would hope that the coloration is not particularly sensitive to the details of model building. In this Article, we examine a number of aspects of coloration against 20 combinations of descriptors and QSAR methods. We demonstrate that atom-level coloration is much less robust to descriptor/method combinations than cross-validated predictions are. Even in ideal cases where the contribution of individual atoms is known, we cannot always recover the important atoms for some descriptor/method combinations. Thus, model interpretation by atom coloration may not be as simple as it first appeared.
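To make the idea of atom coloration concrete, here is a minimal Python sketch of one common way to assign per-atom contributions: remove each atom in turn and record how much the prediction changes. This is an illustration under stated assumptions, not necessarily the procedure used in the Article. The toy molecule representation (a list of atom labels), the `TRUE_CONTRIBUTION` table, the `toy_model` function, and the red/blue/white color convention are all hypothetical. The toy model mimics the "ideal case" in which the contribution of each atom is known, so the recovered contributions can be checked against the truth.

```python
from typing import Callable, Dict, List

Molecule = List[str]  # toy molecule: just a list of atom labels (assumption)

# Hypothetical "true" per-atom contributions to the activity.
TRUE_CONTRIBUTION: Dict[str, float] = {"C": 0.0, "N": 1.0, "O": -0.5, "F": 2.0}


def toy_model(mol: Molecule) -> float:
    """Stand-in for a trained QSAR model: sums the known atom contributions."""
    return sum(TRUE_CONTRIBUTION[atom] for atom in mol)


def atom_contributions(model: Callable[[Molecule], float],
                       mol: Molecule) -> List[float]:
    """Contribution of atom i = prediction(mol) - prediction(mol without atom i)."""
    full = model(mol)
    return [full - model(mol[:i] + mol[i + 1:]) for i in range(len(mol))]


def color(value: float, tol: float = 1e-9) -> str:
    """Map a contribution to a heat-map color (convention chosen for this sketch)."""
    if value > tol:
        return "red"    # atom increases predicted activity
    if value < -tol:
        return "blue"   # atom decreases predicted activity
    return "white"      # atom has no effect


mol = ["C", "C", "N", "O", "F"]
contribs = atom_contributions(toy_model, mol)
colors = [color(c) for c in contribs]

for atom, c, col in zip(mol, contribs, colors):
    print(f"{atom}: {c:+.2f} -> {col}")
```

With a purely additive model like this one, atom removal recovers the true contributions exactly. The abstract's point is that a real descriptor/method combination may not behave this way, so the same procedure can produce different colorations for the same molecule.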