Department of Methodology & Statistics, Faculty of Psychology, Open University of the Netherlands, Heerlen, The Netherlands.
Department of Work & Social Psychology, Faculty of Psychology & Neuroscience, Maastricht University, Maastricht, The Netherlands.
Psychol Health. 2021 Jan;36(1):59-77. doi: 10.1080/08870446.2020.1757098. Epub 2020 May 7.
Although basing conclusions on confidence intervals for effect size estimates is preferred over relying on null hypothesis significance testing alone, confidence intervals in psychology are typically very wide. One reason may be a lack of easily applicable methods for planning studies to achieve sufficiently tight confidence intervals. This paper presents tables and freely accessible tools to facilitate planning studies for the desired accuracy in parameter estimation for a common effect size (Cohen's d). In addition, the importance of such accuracy is demonstrated using data from the Reproducibility Project: Psychology (RPP).
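As a rough illustration of what planning for accuracy involves, the sketch below estimates the width of a confidence interval for Cohen's d and the per-group sample size needed to reach a target width. It is not the tables or tools the paper presents. It assumes two equal-sized independent groups and a common large-sample normal approximation to the variance of d, and the function names `d_ci_width` and `n_for_width` are hypothetical. Exact methods based on the noncentral t distribution will give somewhat different numbers, especially in small samples.

```python
from math import ceil, sqrt
from statistics import NormalDist


def d_ci_width(d, n_per_group, conf=0.95):
    """Approximate full width of the confidence interval for Cohen's d.

    Assumption: two independent groups of equal size n, with the
    large-sample approximation var(d) ~ 2/n + d^2 / (4n).
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)  # e.g. about 1.96 for 95%
    var_d = 2.0 / n_per_group + d ** 2 / (4.0 * n_per_group)
    return 2.0 * z * sqrt(var_d)


def n_for_width(d, target_width, conf=0.95):
    """Smallest per-group n whose approximate interval width is <= target_width."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    # Solve 2 * z * sqrt((2 + d^2 / 4) / n) <= target_width for n.
    n = ceil((2.0 + d ** 2 / 4.0) * (2.0 * z / target_width) ** 2)
    return max(n, 2)


if __name__ == "__main__":
    # Interval width for a medium effect with 20 participants per group.
    print(round(d_ci_width(0.5, 20), 2))
    # Per-group sample size for a total interval width of 0.5.
    print(n_for_width(0.5, 0.5))
```

Under this approximation, the width shrinks only with the square root of the sample size, so halving the interval width requires roughly four times as many participants.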
It is shown that the sampling distribution of Cohen's d is very wide unless sample sizes are considerably larger than is common in psychology studies. This means that effect size estimates can vary substantially from sample to sample, even with perfect replications. The RPP replications' confidence intervals for Cohen's d have widths of around 1 standard deviation (95% confidence interval for the width from 1.05 to 1.39). Therefore, point estimates obtained in replications are likely to vary substantially from the estimates obtained in earlier studies.
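The sample-to-sample variation described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of the RPP data: both groups are drawn from normal populations with unit variance, the true effect is fixed, and every "replication" is a perfect repeat of the same design. The names `cohens_d` and `simulate_d` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev, variance


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def simulate_d(true_d, n_per_group, replications=2000, seed=1):
    """Draw repeated samples from the same populations and return each estimate of d.

    Assumption: normal populations with standard deviation 1, so the mean
    difference equals the true standardized effect.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = []
    for _ in range(replications):
        treatment = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        estimates.append(cohens_d(treatment, control))
    return estimates


if __name__ == "__main__":
    for n in (20, 200):
        estimates = sorted(simulate_d(0.5, n))
        lower = estimates[int(0.025 * len(estimates))]
        upper = estimates[int(0.975 * len(estimates))]
        # Range covering the middle 95% of estimates across perfect replications.
        print(n, round(lower, 2), round(upper, 2))
```

With small groups, the middle 95% of simulated estimates spans a wide range around the true value, even though nothing differs between replications except sampling error.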
The implication is that researchers in psychology, and funders, will have to get used to conducting considerably larger studies if they are to build a strong evidence base.