Cross-validation failure: Small sample sizes lead to large error bars.

Affiliations

Parietal Project-team, INRIA Saclay-île de France, France; CEA/Neurospin bât 145, 91191 Gif-Sur-Yvette, France; Université Paris-Saclay, Saclay, France.

Publication information

Neuroimage. 2018 Oct 15;180(Pt A):68-77. doi: 10.1016/j.neuroimage.2017.06.061. Epub 2017 Jun 24.

Abstract

Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establish their validity and usefulness is cross-validation, testing prediction on unseen data. Here, I would like to raise awareness of the error bars of cross-validation, which are often underestimated. Simple experiments show that the sample sizes of many neuroimaging studies inherently lead to large error bars, e.g., ±10% for 100 samples. The standard error across folds strongly underestimates them. These large error bars compromise the reliability of conclusions drawn with predictive models, such as biomarkers or methods development, where, unlike with cognitive neuroimaging MVPA approaches, more samples cannot be acquired by repeating the experiment across many subjects. Solutions to increase sample size must be investigated, tackling possible increases in heterogeneity of the data.
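
As a rough illustration of why about 100 samples gives error bars near ±10%, the sketch below treats each test prediction as an independent coin flip, so that the observed accuracy follows a binomial distribution. This binomial model, the normal approximation, and the function name `binomial_halfwidth` are assumptions made here for illustration; they are not necessarily the experiments reported in the paper.

```python
import math


def binomial_halfwidth(n_samples, accuracy=0.5, z=1.96):
    """Approximate half-width of a 95% interval on observed accuracy.

    Assumes independent test samples (binomial sampling noise) and uses
    the normal approximation; `accuracy` is the assumed true accuracy.
    """
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


# Half-width shrinks only as 1 / sqrt(n): ten times more samples
# are needed to cut the error bar by roughly a factor of three.
for n in (30, 100, 300, 1000):
    print(f"n = {n:4d}: +/- {100 * binomial_halfwidth(n):.1f}%")
```

Under these assumptions, 100 samples at chance-level accuracy give a half-width of about 9.8%, in line with the ±10% figure quoted in the abstract. Real cross-validation folds share training data and are not independent, so this simple calculation should be read as a lower-bound intuition rather than a full account of the variance.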