Department of Data Analysis, Ghent University, Ghent, Belgium.
Neuroimage. 2020 May 15;212:116601. doi: 10.1016/j.neuroimage.2020.116601. Epub 2020 Feb 7.
Replicating results (i.e., obtaining consistent results using a new independent dataset) is an essential part of good science. As replicability has consequences for theories derived from empirical studies, it is of utmost importance to better understand the underlying mechanisms influencing it. A popular tool for non-invasive neuroimaging studies is functional magnetic resonance imaging (fMRI). While the effect of underpowered studies is well documented, the empirical assessment of the interplay between sample size and replicability of results for task-based fMRI studies remains limited. In this work, we extend existing work on this assessment in two ways. First, we use a large database of 1400 subjects performing four types of tasks from the IMAGEN project to draw a series of independent subsamples of increasing size. Second, replicability is evaluated using a multi-dimensional framework consisting of three measures: (un)conditional test-retest reliability, coherence and stability. We demonstrate not only a positive effect of sample size, but also a trade-off between spatial resolution and replicability. When replicability is assessed voxelwise or when observing small areas of activation, a larger sample size than typically used in fMRI is required to replicate results. When focussing on clusters of voxels, on the other hand, we observe higher replicability. In addition, the size of clusters of activation varies between experimental paradigms and between contrasts of parameter estimates within a paradigm.
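The subsampling idea described in the abstract (drawing independent samples of increasing size from one large pool of subjects) can be illustrated with a minimal sketch. This is not the authors' code: the function name `draw_independent_samples`, the seed, and the example sample sizes are hypothetical, and only the pool size of 1400 subjects comes from the abstract. Independence is approximated here by making the samples disjoint, so that no subject contributes to more than one sample.

```python
import numpy as np


def draw_independent_samples(n_subjects, sample_size, n_samples, seed=0):
    """Return `n_samples` disjoint arrays of subject indices, each of length `sample_size`.

    Hypothetical helper for illustration; disjointness stands in for independence.
    """
    if sample_size * n_samples > n_subjects:
        raise ValueError("not enough subjects to form disjoint samples")
    rng = np.random.default_rng(seed)
    # Shuffle all subject indices once, then cut consecutive non-overlapping blocks.
    shuffled = rng.permutation(n_subjects)
    return [
        shuffled[i * sample_size:(i + 1) * sample_size]
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    # Example sample sizes are assumptions; the abstract only states the pool size.
    for size in (10, 50, 100, 700):
        first, second = draw_independent_samples(1400, size, n_samples=2)
        # A pair of samples of the same size never shares a subject.
        assert set(first).isdisjoint(second)
        print(size, len(first), len(second))
```

With a fixed pool, the largest sample size that still allows a disjoint pair is half the pool (700 of 1400 here), which is why the check at the top of the function matters.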