Department of Chemistry, Bar-Ilan University, Ramat-Gan 5290002, Israel.
Int J Mol Sci. 2020 Oct 22;21(21):7828. doi: 10.3390/ijms21217828.
Quantitative Structure-Activity Relationship (QSAR) models can reveal correlations between activities and structure-based molecular descriptors. This information is important for understanding the factors that govern molecular properties and for designing new compounds with favorable properties. Because the number of calculable descriptors is large, and the number of descriptor combinations is consequently much larger, the derivation of QSAR models can be treated as an optimization problem. For continuous responses, the metrics typically optimized in this process relate to model performance on the training set, for example, R² and Q²_CV. Similar metrics, calculated on an external set of data (e.g., Q²_F1/F2/F3), are used to evaluate the performance of the final models. A common theme of these metrics is that they are context-"ignorant". In this work we propose that QSAR models should be evaluated based on their intended usage. More specifically, we argue that QSAR models developed for Virtual Screening (VS) should be derived and evaluated using a virtual screening-aware metric, e.g., an enrichment-based metric. To demonstrate this point, we developed 21 Multiple Linear Regression (MLR) models for seven targets (three models per target), evaluated them first on validation sets, and subsequently tested their performance on two additional test sets constructed to mimic small-scale virtual screening campaigns. As expected, we found no correlation between model performance evaluated by "classical" metrics, e.g., R² and Q²_F1/F2/F3, and the number of active compounds picked by the models from within a pool of random compounds. In particular, in some cases models with favorable R² and/or Q²_F1/F2/F3 values were unable to pick a single active compound from within the pool, whereas in other cases models with poor R² and/or Q²_F1/F2/F3 values performed well in the context of virtual screening.
We also found no significant correlation between the numbers of active compounds correctly identified by the models in the training, validation and test sets. Next, we developed a new algorithm, the Enrichment Optimizer Algorithm (EOA), for the derivation of MLR models by optimizing an enrichment-based metric, and tested its performance on the same datasets. We found that the best models derived in this manner showed, in most cases, much more consistent results across the training, validation and test sets and outperformed the corresponding MLR models in most virtual screening tests. Finally, we demonstrated that, when tested as binary classifiers, models derived for the same targets by the new algorithm outperformed Random Forest (RF) and Support Vector Machine (SVM)-based models across the training, validation and test sets in most cases. We attribute the better performance of the EOA models in VS to better handling of inactive random compounds. Optimizing an enrichment-based metric is therefore a promising strategy for the derivation of QSAR models for classification and virtual screening.
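
To make the contrast between a "classical" metric and a virtual screening-aware one concrete, the following is a minimal Python sketch. It is not the EOA and not the exact enrichment metric used in this work, which the abstract does not specify. It assumes a generic top-N enrichment: compounds are ranked by predicted activity, and the actives among the N highest-ranked compounds are counted. All function names and the toy data are hypothetical.

```python
def actives_in_top_n(scores, is_active, n):
    """Count active compounds among the n highest-scoring compounds."""
    # Rank compounds from highest to lowest predicted activity.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    return sum(1 for _, active in ranked[:n] if active)


def enrichment_factor(scores, is_active, n):
    """Hit rate in the top-n selection divided by the hit rate of the whole pool."""
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0 or n <= 0:
        raise ValueError("need at least one active compound and n > 0")
    n = min(n, len(scores))
    hit_rate_top = actives_in_top_n(scores, is_active, n) / n
    hit_rate_pool = total_actives / len(scores)
    return hit_rate_top / hit_rate_pool


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical screening pool: 10 compounds, 2 of them active.
pool_scores = [9.0, 8.5, 3.0, 2.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.1]
pool_active = [True, False, True, False, False, False, False, False, False, False]

hits = actives_in_top_n(pool_scores, pool_active, 2)
ef = enrichment_factor(pool_scores, pool_active, 2)
print(f"actives in top 2: {hits}, enrichment factor: {ef:.2f}")
```

R² depends on how closely predicted activities match measured ones for compounds with known activity, whereas the enrichment factor depends only on how the model ranks actives against the rest of the pool. Neither quantity determines the other, which is consistent with the lack of correlation reported above.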