Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters.

Authors

Rácz A, Bajusz D, Héberger K

Affiliations

a Plasma Chemistry Research Group, Hungarian Academy of Sciences, Budapest, Hungary.

b Department of Applied Chemistry, Corvinus University of Budapest, Budapest, Hungary.

Publication information

SAR QSAR Environ Res. 2015;26(7-9):683-700. doi: 10.1080/1062936X.2015.1084647. Epub 2015 Oct 5.

DOI:10.1080/1062936X.2015.1084647

PMID:26434574

Abstract

Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
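The abstract names sum of ranking differences (SRD) without describing the calculation. As a rough illustration only, and not the authors' implementation, the core idea can be sketched as follows: rank the objects (rows) within each column being compared, rank them by a reference column, and sum the absolute rank differences per column. The row-wise mean used as the reference, the function names `average_ranks` and `srd`, and the toy numbers are all assumptions made for this sketch. Normalisation of the SRD values and the randomisation test that usually accompanies SRD are omitted.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srd(columns, reference=None):
    """Return the sum of absolute rank differences of each column to the reference.

    columns: dict mapping a column name to a list of values, one per object (row).
    reference: optional list of reference values; assumed here to default to the
    row-wise mean of all columns.
    """
    names = list(columns)
    n_rows = len(columns[names[0]])
    if reference is None:
        reference = [
            sum(columns[name][i] for name in names) / len(names)
            for i in range(n_rows)
        ]
    ref_ranks = average_ranks(reference)
    return {
        name: sum(abs(r - q) for r, q in zip(average_ranks(columns[name]), ref_ranks))
        for name in names
    }


# Made-up toy data: three hypothetical columns scored on four objects.
toy_columns = {
    "A": [0.90, 0.80, 0.70, 0.60],
    "B": [0.85, 0.90, 0.65, 0.60],
    "C": [0.50, 0.60, 0.70, 0.80],
}
scores = srd(toy_columns)
print(scores)  # {'A': 2.0, 'B': 0.0, 'C': 8.0}
```

A smaller value means the column orders the objects more like the reference does. In this toy case column B reproduces the consensus ordering exactly, while column C deviates most.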

Similar articles

Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters.

SAR QSAR Environ Res. 2015;26(7-9):683-700. doi: 10.1080/1062936X.2015.1084647. Epub 2015 Oct 5.

Sum of ranking differences (SRD) to ensemble multivariate calibration model merits for tuning parameter selection and comparing calibration methods.

Anal Chim Acta. 2015 Apr 15;869:21-33. doi: 10.1016/j.aca.2014.12.056. Epub 2015 Feb 7.

Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR.

Mol Inform. 2019 Aug;38(8-9):e1800154. doi: 10.1002/minf.201800154. Epub 2019 Apr 4.

Monte Carlo method based QSAR modeling of maleimide derivatives as glycogen synthase kinase-3β inhibitors.

Comput Biol Med. 2015 Sep;64:276-82. doi: 10.1016/j.compbiomed.2015.07.004. Epub 2015 Jul 16.

Does rational selection of training and test sets improve the outcome of QSAR modeling?

J Chem Inf Model. 2012 Oct 22;52(10):2570-8. doi: 10.1021/ci300338w. Epub 2012 Oct 3.

Beware of External Validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

Curr Comput Aided Drug Des. 2018;14(4):284-291. doi: 10.2174/1573409914666180426144304.

Modelling methods and cross-validation variants in QSAR: a multi-level analysis.

SAR QSAR Environ Res. 2018 Sep;29(9):661-674. doi: 10.1080/1062936X.2018.1505778. Epub 2018 Aug 30.

Combinatorial QSAR of ambergris fragrance compounds.

J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):582-95. doi: 10.1021/ci034203t.

A comparative QSAR study of benzamidines complement-inhibitory activity and benzene derivatives acute toxicity.

Comput Chem. 2000 Mar;24(2):181-91. doi: 10.1016/s0097-8485(99)00059-5.

The proposal of architecture for chemical splitting to optimize QSAR models for aquatic toxicity.

Chemosphere. 2008 Jun;72(5):772-80. doi: 10.1016/j.chemosphere.2008.03.016. Epub 2008 May 8.

Cited by

Real-Time Acoustic Scene Recognition for Elderly Daily Routines Using Edge-Based Deep Learning.

Sensors (Basel). 2025 Mar 12;25(6):1746. doi: 10.3390/s25061746.

Multilayered screening for multi-targeted anti-Alzheimer's and anti-Parkinson's agents through structure-based pharmacophore modelling, MCDM, docking, molecular dynamics and DFT: a case study of HDAC4 inhibitors.

In Silico Pharmacol. 2025 Jan 21;13(1):16. doi: 10.1007/s40203-024-00302-4. eCollection 2025.

Computer-guided design of novel nitrogen-based heterocyclic sphingosine-1-phosphate (S1P) activators as osteoanabolic agents.

EXCLI J. 2024 May 27;23:818-832. doi: 10.17179/excli2024-7214. eCollection 2024.

CSEL-BGC: A Bioinformatics Framework Integrating Machine Learning for Defining the Biosynthetic Evolutionary Landscape of Uncharacterized Antibacterial Natural Products.

Interdiscip Sci. 2025 Mar;17(1):27-41. doi: 10.1007/s12539-024-00656-5. Epub 2024 Sep 30.

QSAR Study, Molecular Docking and Molecular Dynamic Simulation of Aurora Kinase Inhibitors Derived from Imidazo[4,5-b]pyridine Derivatives.

Molecules. 2024 Apr 13;29(8):1772. doi: 10.3390/molecules29081772.

Identification of Coronary Artery Diseases Using Photoplethysmography Signals and Practical Feature Selection Process.

Bioengineering (Basel). 2023 Feb 13;10(2):249. doi: 10.3390/bioengineering10020249.

Comparison of various methods for validity evaluation of QSAR models.

BMC Chem. 2022 Aug 23;16(1):63. doi: 10.1186/s13065-022-00856-4.

Mol Divers. 2023 Aug;27(4):1603-1612. doi: 10.1007/s11030-022-10514-5. Epub 2022 Aug 17.

The Relevance of Goodness-of-fit, Robustness and Prediction Validation Categories of OECD-QSAR Principles with Respect to Sample Size and Model Type.

Mol Inform. 2022 Nov;41(11):e2200072. doi: 10.1002/minf.202200072. Epub 2022 Jul 25.

Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets.

Front Chem. 2022 Jun 8;10:852893. doi: 10.3389/fchem.2022.852893. eCollection 2022.
