Burns Charles D G, Fracasso Alessio, Rousselet Guillaume A
School of Psychology and Neuroscience, University of Glasgow, G12 8QB, Glasgow, Scotland.
Sci Rep. 2025 Feb 19;15(1):6105. doi: 10.1038/s41598-025-89257-w.
Recent studies have used big neuroimaging datasets to answer an important question: how many subjects are required for reproducible brain-wide association studies? These data-driven approaches could be considered a framework for testing the reproducibility of several neuroimaging models and measures. Here we test part of this framework, namely estimates of statistical errors of univariate brain-behaviour associations obtained from resampling large datasets with replacement. We demonstrate that reported estimates of statistical errors are largely a consequence of bias introduced by random effects when sampling with replacement close to the full sample size. We show that future meta-analyses can largely avoid these biases by only resampling up to 10% of the full sample size. We discuss the implication that reproducing mass-univariate association studies requires tens of thousands of participants, and we urge researchers to adopt other methodological approaches.
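The resampling bias described above can be illustrated with a small null simulation. This is a minimal sketch under stated assumptions, not the authors' analysis pipeline: the data are synthetic Gaussian noise with no true brain-behaviour association, and the sizes (`N_FULL`, `N_FEATURES`, `N_RESAMPLES`), the helper names, and the Fisher z significance test are all hypothetical choices made for illustration. Because each resample inherits the chance correlations already present in the full sample, the variance of a resampled correlation is roughly 1/n + 1/N rather than 1/n, so the apparent false positive rate should drift above the nominal 5% as the resample size n approaches the full sample size N.

```python
import numpy as np

# Hypothetical sizes for illustration only; real datasets are far larger.
N_FULL = 1000        # subjects in the "full" dataset
N_FEATURES = 2000    # independent brain features, all unrelated to behaviour
N_RESAMPLES = 20     # resamples drawn per resample size
Z_CRIT = 1.96        # two-sided 5% threshold under a normal approximation

rng = np.random.default_rng(0)
brain = rng.standard_normal((N_FULL, N_FEATURES))  # null brain measures
behaviour = rng.standard_normal(N_FULL)            # null behavioural score


def column_correlations(x, y):
    """Pearson correlation of every column of x with the vector y."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return (xc.T @ yc) / denom


def null_false_positive_rate(n_resample):
    """Share of null associations flagged significant when resampling
    n_resample subjects WITH replacement from the full dataset."""
    hits = 0
    for _ in range(N_RESAMPLES):
        idx = rng.integers(0, N_FULL, size=n_resample)  # with replacement
        r = column_correlations(brain[idx], behaviour[idx])
        z = np.arctanh(r) * np.sqrt(n_resample - 3)     # Fisher z statistic
        hits += int(np.sum(np.abs(z) > Z_CRIT))
    return hits / (N_RESAMPLES * N_FEATURES)


# Compare resampling at 10% of the full sample with resampling at 100%.
rates = {n: null_false_positive_rate(n) for n in (N_FULL // 10, N_FULL)}
for n, rate in rates.items():
    print(f"resample size {n:5d}: apparent false positive rate = {rate:.3f}")
```

In this sketch the rate at 10% of the full sample should stay fairly close to the nominal level, while the rate at the full sample size should be clearly inflated, even though no true effect exists anywhere in the simulated data.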