School of Psychology, University of Sussex, Brighton, BN1 9QH, UK.
Lancaster University, Lancaster, UK.
Psychon Bull Rev. 2018 Feb;25(1):207-218. doi: 10.3758/s13423-017-1266-z.
Inference using significance testing and Bayes factors is compared and contrasted in five case studies based on real research. The first study illustrates that the methods will often agree, both in motivating researchers to conclude that H1 is better supported than H0 and, the other way round, that H0 is better supported than H1. The next four, however, show that the methods will also often disagree. In these cases, the aim of the paper is to motivate the sensible evidential conclusion and then see which approach matches those intuitions. Specifically, it is shown that a high-powered non-significant result is consistent with no evidence for H0 over H1 worth mentioning, which a Bayes factor can show, and, conversely, that a low-powered non-significant result is consistent with substantial evidence for H0 over H1, again indicated by Bayesian analyses. The fourth study illustrates that a high-powered significant result may not amount to any evidence for H1 over H0, matching the Bayesian conclusion. Finally, the fifth study illustrates that different theories can be evidentially supported to different degrees by the same data, a fact that P-values cannot reflect but Bayes factors can. It is argued that, where the two approaches disagree, the appropriate conclusions match the Bayesian inferences rather than those based on significance testing.
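To make the contrast between a P-value and a Bayes factor concrete, the following is a minimal sketch, not the paper's own analysis. It assumes a normal likelihood for the observed effect, a point null H0 (effect = 0), and an H1 modelled as a half-normal distribution whose scale is the effect size the theory roughly predicts. The function name, the numerical integration scheme, and all input numbers are hypothetical choices made for illustration, and the thresholds of 3 and 1/3 are a commonly used convention for "substantial" evidence rather than values taken from this abstract.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor_half_normal(obs, se, h1_scale, steps=20000):
    """Bayes factor B = P(data | H1) / P(data | H0).

    obs      : observed effect (e.g. a mean difference)
    se       : standard error of the observed effect
    h1_scale : SD of the half-normal prior on the effect under H1
               (an assumed model of H1, not the only possible one)
    """
    # H0 is a point null: the true effect is exactly zero.
    like_h0 = normal_pdf(obs, 0.0, se)

    # Under H1, average the likelihood over the half-normal prior
    # using a midpoint rule on [0, 10 * h1_scale]; the prior mass
    # beyond that range is negligible.
    upper = 10.0 * h1_scale
    width = upper / steps
    like_h1 = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width
        prior = 2.0 * normal_pdf(theta, 0.0, h1_scale)  # half-normal density
        like_h1 += prior * normal_pdf(obs, theta, se) * width

    return like_h1 / like_h0


def describe(b):
    """Label a Bayes factor using the conventional 3 and 1/3 thresholds."""
    if b > 3.0:
        return "evidence for H1 over H0"
    if b < 1.0 / 3.0:
        return "evidence for H0 over H1"
    return "data insensitive"


# Two hypothetical non-significant results (observed effect of 0),
# differing only in how precisely the effect was estimated.
precise = bayes_factor_half_normal(obs=0.0, se=1.0, h1_scale=5.0)
noisy = bayes_factor_half_normal(obs=0.0, se=5.0, h1_scale=5.0)
print(f"precise estimate: B = {precise:.2f} ({describe(precise)})")
print(f"noisy estimate:   B = {noisy:.2f} ({describe(noisy)})")
```

With these made-up inputs, both results are non-significant, yet the precise estimate gives a Bayes factor of about 0.20 (evidence for H0 over H1) while the noisy one gives about 0.71 (insensitive data). The sketch only illustrates the general point that a non-significant P-value does not by itself distinguish these two situations; it does not reproduce any of the five case studies.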