Cross-validation failure: Small sample sizes lead to large error bars.

Affiliations

Parietal Project-team, INRIA Saclay-île de France, France; CEA/Neurospin bât 145, 91191 Gif-Sur-Yvette, France; Université Paris-Saclay, Saclay, France.

Publication information

Neuroimage. 2018 Oct 15;180(Pt A):68-77. doi: 10.1016/j.neuroimage.2017.06.061. Epub 2017 Jun 24.

DOI: 10.1016/j.neuroimage.2017.06.061

PMID: 28655633

Abstract

Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establish their validity and usefulness is cross-validation, testing prediction on unseen data. Here, I would like to raise awareness on error bars of cross-validation, which are often underestimated. Simple experiments show that sample sizes of many neuroimaging studies inherently lead to large error bars, e.g. ±10% for 100 samples. The standard error across folds strongly underestimates them. These large error bars compromise the reliability of conclusions drawn with predictive models, such as biomarkers or methods developments where, unlike with cognitive neuroimaging MVPA approaches, more samples cannot be acquired by repeating the experiment across many subjects. Solutions to increase sample size must be investigated, tackling possible increases in heterogeneity of the data.
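As a rough illustration of the scale of these error bars, the sketch below treats each test prediction as an independent coin flip and uses the normal approximation to a binomial proportion. This is a simplifying assumption made here for illustration, not the experiment reported in the paper, and the function name `accuracy_half_width` is hypothetical. Under that assumption, a classifier evaluated on 100 samples at chance level has a 95% interval of roughly ±10% around its measured accuracy.

```python
import math
from statistics import NormalDist


def accuracy_half_width(n_samples, accuracy=0.5, coverage=0.95):
    """Half-width of the interval around a measured accuracy.

    Assumes (for illustration only) that the n_samples test predictions
    are independent and each is correct with probability `accuracy`,
    and uses the normal approximation to the binomial distribution.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    # Standard deviation of the proportion of correct predictions.
    std = math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
    return z * std


if __name__ == "__main__":
    for n in (30, 100, 300, 1000):
        print(f"n = {n:5d}: +/- {100 * accuracy_half_width(n):.1f}%")
```

Because the half-width shrinks only with the square root of the sample size, halving the error bar takes about four times as many samples. Cross-validation reuses the same data across folds, so its actual error bars need not match this simple binomial picture.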


Similar articles

Statistical inference and multiple testing correction in classification-based multi-voxel pattern analysis (MVPA): random permutations and cluster size control.

Neuroimage. 2013 Jan 15;65:69-82. doi: 10.1016/j.neuroimage.2012.09.063. Epub 2012 Oct 4.

FReM - Scalable and stable decoding with fast regularized ensemble of models.

Neuroimage. 2018 Oct 15;180(Pt A):160-172. doi: 10.1016/j.neuroimage.2017.10.005. Epub 2017 Oct 10.

Decoding fMRI activity in the time domain improves classification performance.

Neuroimage. 2018 Oct 15;180(Pt A):203-210. doi: 10.1016/j.neuroimage.2017.08.018. Epub 2017 Aug 9.

The significance of streamlined postprocessing approaches for clinical FMRI.

AJNR Am J Neuroradiol. 2013 Jun-Jul;34(6):1194-6. doi: 10.3174/ajnr.A3446. Epub 2013 Jan 4.

A method for generating reproducible evidence in fMRI studies.

Neuroimage. 2006 Jan 15;29(2):383-95. doi: 10.1016/j.neuroimage.2005.08.015. Epub 2005 Oct 14.

Reproducibility of importance extraction methods in neural network based fMRI classification.

Neuroimage. 2018 Nov 1;181:44-54. doi: 10.1016/j.neuroimage.2018.06.076. Epub 2018 Jun 30.

Cross-validation and permutations in MVPA: Validity of permutation strategies and power of cross-validation schemes.

Neuroimage. 2021 Sep;238:118145. doi: 10.1016/j.neuroimage.2021.118145. Epub 2021 May 4.

Fast bootstrapping and permutation testing for assessing reproducibility and interpretability of multivariate fMRI decoding models.

PLoS One. 2013 Nov 14;8(11):e79271. doi: 10.1371/journal.pone.0079271. eCollection 2013.

Searchlight analysis: promise, pitfalls, and potential.

Neuroimage. 2013 Sep;78:261-9. doi: 10.1016/j.neuroimage.2013.03.041. Epub 2013 Apr 1.

Cited by

High-precision machine learning identifies a reproducible functional connectivity signature of autism spectrum diagnosis in a subset of individuals.

Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf091.

Lifespan Tree of Brain Anatomy: Diagnostic Values for Motor and Cognitive Neurodegenerative Diseases.

Hum Brain Mapp. 2025 Sep;46(13):e70336. doi: 10.1002/hbm.70336.

Prenatal cannabis exposure, the brain, and psychopathology during early adolescence.

Nat Ment Health. 2024 Aug;2(8):975-986. doi: 10.1038/s44220-024-00281-7. Epub 2024 Jul 4.

Challenges in multi-task learning for fMRI-based diagnosis: Benefits for psychiatric conditions and CNVs would likely require thousands of patients.

Imaging Neurosci (Camb). 2024 Jul 26;2. doi: 10.1162/imag_a_00222. eCollection 2024.

From brain to education through machine learning: Predicting literacy and numeracy skills from neuroimaging data.

Imaging Neurosci (Camb). 2024 Jul 3;2. doi: 10.1162/imag_a_00219. eCollection 2024.

Statistical variability in comparing accuracy of neuroimaging based classification models via cross validation.

Sci Rep. 2025 Aug 6;15(1):28745. doi: 10.1038/s41598-025-12026-2.

Refining the generation, interpretation and application of multi-organ, multi-omics biological aging clocks.

Nat Aging. 2025 Aug 5. doi: 10.1038/s43587-025-00928-9.

Cross-Dataset Evaluation of Dementia Longitudinal Progression Prediction Models.

Hum Brain Mapp. 2025 Aug 1;46(11):e70280. doi: 10.1002/hbm.70280.

Longer scans boost prediction and cut costs in brain-wide association studies.

Nature. 2025 Jul 16. doi: 10.1038/s41586-025-09250-1.

A machine learning pipeline for efficient differentiation between bipolar and major depressive disorder based on multimodal structural neuroimaging.

Neurosci Appl. 2023 Dec 22;3:103931. doi: 10.1016/j.nsa.2023.103931. eCollection 2024.
