Rácz A, Bajusz D, Héberger K
a Plasma Chemistry Research Group, Hungarian Academy of Sciences, Budapest, Hungary.
b Department of Applied Chemistry, Corvinus University of Budapest, Budapest, Hungary.
SAR QSAR Environ Res. 2015;26(7-9):683-700. doi: 10.1080/1062936X.2015.1084647. Epub 2015 Oct 5.
Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how to interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets serve as case studies for comparing model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While exchanging the original training and (external) test sets does not affect the ranking of performance parameters, it yields improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
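The abstract names sum of ranking differences (SRD) without spelling out the computation. As a rough illustration only, the core idea can be sketched as follows: each column of a table (for example, one performance parameter evaluated over several models) is ranked, a reference column is ranked the same way, and the absolute rank differences are summed. The sketch assumes the row-wise mean as the reference, which is one common choice but not necessarily the one used in the paper; the function names and the example values are hypothetical, and the validation steps of the published SRD procedure are omitted.

```python
from statistics import mean


def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def srd(table, reference=None):
    """Return {column: (raw SRD, SRD as % of the maximum)}.

    table maps a column name to a list of values, one per row (object).
    If no reference is given, the row-wise mean is used (an assumption
    of this sketch, not a statement about the paper's setup).
    """
    columns = list(table)
    n_rows = len(table[columns[0]])
    if reference is None:
        reference = [mean(table[c][i] for c in columns) for i in range(n_rows)]
    ref_ranks = average_ranks(reference)
    max_srd = (n_rows ** 2) // 2  # largest attainable SRD for n_rows objects
    result = {}
    for c in columns:
        ranks = average_ranks(table[c])
        raw = sum(abs(r - q) for r, q in zip(ranks, ref_ranks))
        result[c] = (raw, 100 * raw / max_srd)
    return result


# Made-up numbers purely for illustration: 4 rows (objects), 3 columns to compare.
example = {
    "A": [0.9, 0.8, 0.7, 0.6],
    "B": [0.6, 0.7, 0.8, 0.9],
    "C": [0.9, 0.7, 0.8, 0.6],
}
scores = srd(example)
```

A smaller SRD means the column orders the rows more like the reference does. With these invented values, column C reproduces the reference ranking exactly, while column B deviates the most.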