Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou, China.
Hum Brain Mapp. 2022 Jan;43(1):244-254. doi: 10.1002/hbm.25154. Epub 2020 Aug 25.
The problem of poor reproducibility of scientific findings has received much attention in recent years, in a variety of fields including psychology and neuroscience. The problem has been partly attributed to publication bias and unwanted practices such as p-hacking. Low statistical power in individual studies is also understood to be an important factor. In a recent multisite collaborative study, we mapped brain anatomical left-right asymmetries for regional measures of surface area and cortical thickness, in 99 MRI datasets from around the world, for a total of over 17,000 participants. In the present study, we revisited these hemispheric effects from the perspective of reproducibility. Within each dataset, we considered that an effect had been reproduced when it matched the meta-analytic effect from the 98 other datasets, in terms of effect direction and significance threshold. In this sense, the results within each dataset were viewed as coming from separate studies in an "ideal publishing environment," that is, free from selective reporting and p-hacking. We found an average reproducibility rate of 63.2% (SD = 22.9%, min = 22.2%, max = 97.0%). As expected, reproducibility was higher for larger effects and in larger datasets. Reproducibility was not obviously related to the age of participants, scanner field strength, FreeSurfer software version, cortical regional measurement reliability, or regional size. These findings constitute an empirical illustration of reproducibility in the absence of publication bias or p-hacking, when assessing realistic biological effects in heterogeneous neuroscience data, and given typically used sample sizes.
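The leave-one-out criterion described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It rests on several assumptions that the abstract does not state: each dataset contributes one effect estimate with a standard error, the reference effect is pooled with a fixed-effect inverse-variance meta-analysis, significance is judged with a two-sided normal approximation at a hypothetical threshold `alpha`, and "matched" is read as "same sign as a significant reference effect and itself significant". The function names are illustrative only.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z statistic (normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))


def pooled_effect(effects, ses):
    """Fixed-effect inverse-variance pooled estimate and its standard error.

    Assumption: a fixed-effect model is used here for brevity; the abstract
    does not say which meta-analytic model was applied.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


def leave_one_out_reproducibility(effects, ses, alpha=0.05):
    """For each dataset, compare its effect with the pooled effect of the rest.

    Returns one flag per dataset: True (reproduced), False (not reproduced),
    or None when the reference effect itself is not significant, so there is
    nothing to reproduce under this sketch's reading of the criterion.
    """
    flags = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_ses = ses[:i] + ses[i + 1:]
        ref, ref_se = pooled_effect(rest_effects, rest_ses)
        if two_sided_p(ref / ref_se) >= alpha:
            flags.append(None)
            continue
        same_direction = (effects[i] > 0) == (ref > 0)
        significant = two_sided_p(effects[i] / ses[i]) < alpha
        flags.append(same_direction and significant)
    return flags


def reproducibility_rate(flags):
    """Share of reproduced effects among those with a significant reference."""
    scored = [f for f in flags if f is not None]
    return sum(scored) / len(scored) if scored else float("nan")


if __name__ == "__main__":
    # Made-up effect sizes and standard errors for five hypothetical datasets.
    effects = [0.5, 0.4, 0.6, -0.1, 0.05]
    ses = [0.1, 0.1, 0.1, 0.1, 0.2]
    flags = leave_one_out_reproducibility(effects, ses)
    print(flags, reproducibility_rate(flags))
```

In this toy input, the fourth dataset has the opposite sign to the pooled reference and the fifth is too imprecise to reach significance, so neither counts as reproduced under the stated assumptions. This mirrors the abstract's observation that reproducibility depends on effect size and dataset size.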