Cross-validation pitfalls when selecting and assessing regression and classification models.

Affiliations

Research Centre for Cheminformatics, Jasenova 7, 11030, Beograd, Serbia.

Laboratory for Molecular Biomedicine, Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Vojvode Stepe 444a, 11010, Beograd, Serbia.

Publication information

J Cheminform. 2014 Mar 29;6(1):10. doi: 10.1186/1758-2946-6-10.

DOI: 10.1186/1758-2946-6-10
PMID: 24678909
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3994246/
Abstract

BACKGROUND

We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches.

METHODS

We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case.
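
The abstract does not spell out the steps of repeated grid-search V-fold cross-validation, so the following is only a minimal sketch of one plausible reading: for each repeat, draw a fresh V-fold split, score every grid point on that same split, then average each grid point's error over the repeats and pick the minimizer. The k-nearest-neighbour regressor, the toy data, and all function names (`make_folds`, `cv_error`, `repeated_grid_search_cv`) are illustrative assumptions, not taken from the paper.

```python
import random
from statistics import mean

def make_folds(n, v, rng):
    """Shuffle indices 0..n-1 and deal them into v folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::v] for i in range(v)]

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D x)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def cv_error(data, k, folds):
    """Mean squared error pooled over all held-out points of one V-fold split."""
    sq = []
    for fold in folds:
        held = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held]
        sq.extend((knn_predict(train, data[i][0], k) - data[i][1]) ** 2 for i in fold)
    return mean(sq)

def repeated_grid_search_cv(data, grid, v=5, n_repeats=10, seed=0):
    """Average the V-fold CV error of each grid point over several random splits."""
    rng = random.Random(seed)
    errors = {k: [] for k in grid}
    for _ in range(n_repeats):
        folds = make_folds(len(data), v, rng)  # one split shared by all grid points
        for k in grid:
            errors[k].append(cv_error(data, k, folds))
    mean_err = {k: mean(e) for k, e in errors.items()}
    best = min(mean_err, key=mean_err.get)     # assumed rule: lowest mean error wins
    return best, mean_err

# Hypothetical toy data: a noisy quadratic in one variable.
data_rng = random.Random(42)
data = [(x, x * x + data_rng.gauss(0, 0.5))
        for x in (data_rng.uniform(-3, 3) for _ in range(60))]

grid = [1, 3, 5, 9, 15]
best_k, mean_err = repeated_grid_search_cv(data, grid)
print("selected k:", best_k)
```

Reusing one split for every grid point within a repeat is a design choice made here so that grid points are compared on identical folds; whether the paper does the same is not stated in the abstract.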

RESULTS

We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models.
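
To make the point about split-to-split variation concrete, here is a small sketch that holds the model and data fixed and changes only the random V-fold partition. The data are synthetic and the k-nearest-neighbour model with k = 5 is an arbitrary assumption; none of this reproduces the paper's QSAR experiments.

```python
import random
from statistics import mean, stdev

def vfold_mse(data, k, v, rng):
    """V-fold CV mean squared error of 1-D k-NN regression for one random split."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    sq = []
    for f in range(v):
        held = set(idx[f::v])
        train = [data[i] for i in idx if i not in held]
        for i in held:
            nearest = sorted(train, key=lambda p: abs(p[0] - data[i][0]))[:k]
            sq.append((mean(y for _, y in nearest) - data[i][1]) ** 2)
    return mean(sq)

# Hypothetical toy data: a noisy quadratic in one variable.
data_rng = random.Random(1)
data = [(x, x * x + data_rng.gauss(0, 0.5))
        for x in (data_rng.uniform(-3, 3) for _ in range(60))]

# Same data, same model, 20 different random partitions into 5 folds.
split_rng = random.Random(0)
scores = [vfold_mse(data, k=5, v=5, rng=split_rng) for _ in range(20)]
spread = max(scores) - min(scores)
print("min:", min(scores), "max:", max(scores), "sd:", stdev(scores))
```

Any nonzero spread here comes purely from the choice of split, which is the source of variation the abstract says must be accounted for.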

CONCLUSIONS

We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
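
Repeated nested cross-validation can be sketched as follows, under the usual definition of nesting: an outer V-fold loop estimates prediction error, and within each outer training set an inner cross-validation tunes the parameter, so the held-out fold never influences tuning. The whole procedure is then repeated with new random splits. This is an assumed, simplified form (a single inner grid search per outer fold, k-NN on toy data, hypothetical function names), not the authors' exact algorithm.

```python
import random
from statistics import mean

def folds_of(n, v, rng):
    """Shuffle indices 0..n-1 and deal them into v folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::v] for i in range(v)]

def knn(train, x, k):
    """Mean y of the k training points nearest to x (1-D x)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def cv_mse(points, k, folds):
    """Pooled mean squared error of k-NN over the given folds."""
    sq = []
    for fold in folds:
        held = set(fold)
        train = [p for i, p in enumerate(points) if i not in held]
        sq.extend((knn(train, points[i][0], k) - points[i][1]) ** 2 for i in fold)
    return mean(sq)

def tune(points, grid, v, rng):
    """Inner loop: choose the grid value with the lowest V-fold CV error."""
    folds = folds_of(len(points), v, rng)
    return min(grid, key=lambda k: cv_mse(points, k, folds))

def nested_cv(points, grid, v_outer, v_inner, rng):
    """Outer loop: tune on each outer training set, score on its held-out fold."""
    sq = []
    for fold in folds_of(len(points), v_outer, rng):
        held = set(fold)
        train = [p for i, p in enumerate(points) if i not in held]
        k = tune(train, grid, v_inner, rng)  # held-out fold is not used for tuning
        sq.extend((knn(train, points[i][0], k) - points[i][1]) ** 2 for i in fold)
    return mean(sq)

def repeated_nested_cv(points, grid, v_outer=5, v_inner=5, n_repeats=5, seed=0):
    """Run nested CV several times with different random splits."""
    rng = random.Random(seed)
    return [nested_cv(points, grid, v_outer, v_inner, rng) for _ in range(n_repeats)]

# Hypothetical toy data: a noisy quadratic in one variable.
data_rng = random.Random(7)
data = [(x, x * x + data_rng.gauss(0, 0.5))
        for x in (data_rng.uniform(-3, 3) for _ in range(60))]

estimates = repeated_nested_cv(data, grid=[1, 3, 5, 9, 15])
print("nested CV error per repeat:", estimates)
print("mean over repeats:", mean(estimates))
```

Reporting the list of per-repeat estimates alongside their mean keeps the variability of the error estimate visible instead of hiding it behind a single number.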


Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/c094dfc93490/13321_2014_Article_587_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/c62b2426963e/13321_2014_Article_587_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/22a83b4e01fd/13321_2014_Article_587_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/5fa42d340f48/13321_2014_Article_587_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/0576f4d78e4c/13321_2014_Article_587_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/480aa40cec37/13321_2014_Article_587_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/32e08e978af2/13321_2014_Article_587_Fig7_HTML.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/f51bb0b81f32/13321_2014_Article_587_Fig8_HTML.jpg
Figure 9: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/11474470a829/13321_2014_Article_587_Fig9_HTML.jpg
Figure 10: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/4a2d2a04fb09/13321_2014_Article_587_Fig10_HTML.jpg
Figure 11: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/14a9f1dc4b28/13321_2014_Article_587_Fig11_HTML.jpg
Figure 12: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/d98e3b61bbe4/13321_2014_Article_587_Fig12_HTML.jpg
Figure 13: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/a6c358394702/13321_2014_Article_587_Fig13_HTML.jpg
Figure 14: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/1da9ffce3ac5/13321_2014_Article_587_Fig14_HTML.jpg
Figure 15: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/1c39aca3debf/13321_2014_Article_587_Fig15_HTML.jpg
Figure 16: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbca/3994246/cf91d1be8845/13321_2014_Article_587_Fig16_HTML.jpg

Similar articles

1
Cross-validation pitfalls when selecting and assessing regression and classification models.
J Cheminform. 2014 Mar 29;6(1):10. doi: 10.1186/1758-2946-6-10.
2
Channel selection and classification of electroencephalogram signals: an artificial neural network and genetic algorithm-based approach.
Artif Intell Med. 2012 Jun;55(2):117-26. doi: 10.1016/j.artmed.2012.02.001. Epub 2012 Apr 12.
3
Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation.
J Cheminform. 2014 Nov 26;6(1):47. doi: 10.1186/s13321-014-0047-1. eCollection 2014.
4
Gene selection in cancer classification using sparse logistic regression with Bayesian regularization.
Bioinformatics. 2006 Oct 1;22(19):2348-55. doi: 10.1093/bioinformatics/btl386. Epub 2006 Jul 14.
5
Bias in error estimation when using cross-validation for model selection.
BMC Bioinformatics. 2006 Feb 23;7:91. doi: 10.1186/1471-2105-7-91.
6
GSNFS: Gene subnetwork biomarker identification of lung cancer expression data.
BMC Med Genomics. 2016 Dec 5;9(Suppl 3):70. doi: 10.1186/s12920-016-0231-4.
7
Genomic-enabled prediction with classification algorithms.
Heredity (Edinb). 2014 Jun;112(6):616-26. doi: 10.1038/hdy.2013.144. Epub 2014 Jan 15.
8
Validation of differential gene expression algorithms: application comparing fold-change estimation to hypothesis testing.
BMC Bioinformatics. 2010 Jan 28;11:63. doi: 10.1186/1471-2105-11-63.
9
Reviewing ensemble classification methods in breast cancer.
Comput Methods Programs Biomed. 2019 Aug;177:89-112. doi: 10.1016/j.cmpb.2019.05.019. Epub 2019 May 20.
10
Beware of External Validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.
Curr Comput Aided Drug Des. 2018;14(4):284-291. doi: 10.2174/1573409914666180426144304.

Cited by

1
Decoding HIV Discourse on Social Media: Large-Scale Analysis of 191,972 Tweets Using Machine Learning, Topic Modeling, and Temporal Analysis.
J Med Internet Res. 2025 Aug 29;27:e76745. doi: 10.2196/76745.
2
Enhancing the Analysis of Rheological Behavior in Clinker-Aided Cementitious Systems Through Large Language Model-Based Synthetic Data Generation.
Materials (Basel). 2025 Jul 30;18(15):3579. doi: 10.3390/ma18153579.
3
Statistical variability in comparing accuracy of neuroimaging based classification models via cross validation.
Sci Rep. 2025 Aug 6;15(1):28745. doi: 10.1038/s41598-025-12026-2.
4
A comparative analysis of emotion recognition from EEG signals using temporal features and hyperparameter-tuned machine learning techniques.
MethodsX. 2025 Jun 25;15:103468. doi: 10.1016/j.mex.2025.103468. eCollection 2025 Dec.
5
Multi-modal analyses of proteomic measurements associated with type 2 diabetes from the Project Baseline Health Study.
Commun Med (Lond). 2025 Jul 3;5(1):272. doi: 10.1038/s43856-025-00964-x.
6
Evaluation of the Effect of Using Different Types of Clinker Grinding Aids on Grinding Performance by Numerical Analysis.
Materials (Basel). 2025 Jun 9;18(12):2712. doi: 10.3390/ma18122712.
7
The Development of an Ultrasound-Based Scoring System for the Prediction of Interstitial Pregnancy.
J Clin Med. 2025 Jun 14;14(12):4238. doi: 10.3390/jcm14124238.
8
Prognostic model for predicting recurrence in breast cancer patients in Saudi Arabia.
Sci Rep. 2025 May 26;15(1):18388. doi: 10.1038/s41598-025-94530-z.
9
Prescriptive Predictors of Mindfulness Ecological Momentary Intervention for Social Anxiety Disorder: Machine Learning Analysis of Randomized Controlled Trial Data.
JMIR Ment Health. 2025 May 13;12:e67210. doi: 10.2196/67210.
10
Adapting Generative Large Language Models for Information Extraction from Unstructured Electronic Health Records in Residential Aged Care: A Comparative Analysis of Training Approaches.
J Healthc Inform Res. 2025 Feb 20;9(2):191-219. doi: 10.1007/s41666-025-00190-z. eCollection 2025 Jun.

References cited in this article

1
The Use of Rule-Based and QSPR Approaches in ADME Profiling: A Case Study on Caco-2 Permeability.
Mol Inform. 2013 Jun;32(5-6):459-79. doi: 10.1002/minf.201200166. Epub 2013 May 15.
2
Correcting the optimal resampling-based error rate by estimating the error rate of wrapper algorithms.
Biometrics. 2013 Sep;69(3):693-702. doi: 10.1111/biom.12041. Epub 2013 Jul 11.
3
Modeling phospholipidosis induction: reliability and warnings.
J Chem Inf Model. 2013 Jun 24;53(6):1436-46. doi: 10.1021/ci400113t. Epub 2013 Jun 5.
4
Regularization Paths for Generalized Linear Models via Coordinate Descent.
J Stat Softw. 2010;33(1):1-22.
5
Bias in error estimation when using cross-validation for model selection.
BMC Bioinformatics. 2006 Feb 23;7:91. doi: 10.1186/1471-2105-7-91.
6
General melting point prediction based on a diverse compound data set and artificial neural networks.
J Chem Inf Model. 2005 May-Jun;45(3):581-90. doi: 10.1021/ci0500132.
7
Assessing the reliability of a QSAR model's predictions.
J Mol Graph Model. 2005 Jun;23(6):503-23. doi: 10.1016/j.jmgm.2005.03.003.
8
Derivation and validation of toxicophores for mutagenicity prediction.
J Med Chem. 2005 Jan 13;48(1):312-20. doi: 10.1021/jm040835a.
9
A mathematical model for prediction of drug molecule diffusion across the blood-brain barrier.
Can J Neurol Sci. 2004 Nov;31(4):520-7. doi: 10.1017/s0317167100003759.
10
Selection bias in gene extraction on the basis of microarray gene-expression data.
Proc Natl Acad Sci U S A. 2002 May 14;99(10):6562-6. doi: 10.1073/pnas.102102699. Epub 2002 Apr 30.