van Aert Robbie C M, van Assen Marcel A L M
Department of Methodology and Statistics, Tilburg University, the Netherlands.
Department of Sociology, Utrecht University, the Netherlands.
PLoS One. 2017 Apr 7;12(4):e0175302. doi: 10.1371/journal.pone.0175302. eCollection 2017.
The vast majority of published results in the literature are statistically significant, which raises concerns about their reliability. The Reproducibility Project: Psychology (RPP) and the Experimental Economics Replication Project (EE-RP) both replicated a large number of published studies in psychology and economics. Both the original study and its replication were statistically significant in 36.1% of the study pairs in RPP and in 68.8% in EE-RP, suggesting many null effects among the replicated studies. However, evidence in favor of the null hypothesis cannot be examined with null hypothesis significance testing. We developed a Bayesian meta-analysis method, called snapshot hybrid, that is easy to use and understand and quantifies the amount of evidence in favor of a zero, small, medium, and large effect. The method computes posterior model probabilities for a zero, small, medium, and large effect and adjusts for publication bias by taking into account that the original study is statistically significant. We first analytically approximate the method's performance and demonstrate the necessity of controlling for the original study's significance to enable the accumulation of evidence for a true zero effect. We then apply the method to the data of RPP and EE-RP, showing that the underlying effect sizes of the studies included in EE-RP are generally larger than those in RPP, but that the sample sizes, especially of the studies included in RPP, are often too small to draw definite conclusions about the true effect size. We also illustrate how snapshot hybrid can be used to determine the required sample size of a replication, akin to power analysis in null hypothesis significance testing, and present an easy-to-use web application (https://rvanaert.shinyapps.io/snapshot/) and R code for applying the method.
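The core computation described above can be illustrated with a minimal sketch. Assuming effect sizes on the Fisher-z (transformed correlation) scale, equal prior probabilities over the four candidate effects, and Cohen's benchmarks of 0, 0.1, 0.3, and 0.5 for the zero, small, medium, and large effect (assumptions for illustration; this is not the authors' R code), the adjustment for publication bias amounts to replacing the original study's normal likelihood with a normal likelihood truncated to the statistically significant region, while the replication contributes an untruncated likelihood:

```python
import numpy as np
from scipy.stats import norm

def snapshot_probs(y_orig, n_orig, y_rep, n_rep, alpha=0.05):
    """Sketch of snapshot-hybrid-style posterior model probabilities.

    y_orig, y_rep : observed Fisher-z effect sizes of original study and replication
    n_orig, n_rep : sample sizes (Fisher-z standard error is 1/sqrt(n - 3))
    Returns posterior probabilities for a zero, small, medium, and large effect,
    assuming equal prior model probabilities (illustrative assumption).
    """
    # Candidate true effects: correlations 0, .1, .3, .5 on the Fisher-z scale
    thetas = np.arctanh([0.0, 0.1, 0.3, 0.5])
    se_o = 1.0 / np.sqrt(n_orig - 3)
    se_r = 1.0 / np.sqrt(n_rep - 3)
    # Significance threshold for the original study (two-tailed test,
    # positive significant effect), on the Fisher-z scale
    crit = norm.ppf(1 - alpha / 2) * se_o

    likes = []
    for t in thetas:
        # Original study: density truncated to the significant region y > crit,
        # which encodes that only a significant original study was published
        trunc_mass = 1 - norm.cdf(crit, loc=t, scale=se_o)
        l_orig = norm.pdf(y_orig, loc=t, scale=se_o) / trunc_mass
        # Replication: ordinary (untruncated) normal likelihood
        l_rep = norm.pdf(y_rep, loc=t, scale=se_r)
        likes.append(l_orig * l_rep)

    likes = np.array(likes)
    return likes / likes.sum()  # equal priors cancel in the normalization
```

For example, a significant original study combined with a large replication estimating an effect near zero shifts most posterior probability to the zero-effect model, which is exactly the kind of evidence for the null that significance testing cannot provide.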