Office of Research and Academia-Government-Community Collaboration, Education and Research Center for Artificial Intelligence and Data Innovation, Hiroshima University, Hiroshima, Japan.
University of Groningen, Groningen, The Netherlands.
Psychon Bull Rev. 2022 Feb;29(1):70-87. doi: 10.3758/s13423-021-01962-5. Epub 2021 Jul 12.
Optional stopping is the practice of sequentially testing a null hypothesis as data are collected, stopping as soon as the null hypothesis is rejected. It is well known that optional stopping is problematic in the context of p value-based null hypothesis significance testing: The false-positive rate quickly exceeds the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p values are replaced by Bayes factors, has, perhaps surprisingly, been far less settled. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. Deng et al. (2016) and Hendriksen et al. (2020) have provided mathematical evidence that optional stopping under null hypothesis Bayesian testing is valid under some conditions. These papers are, however, too technical for most researchers in the applied social sciences. In this paper, we provide some mathematical derivations concerning Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as a random variable across repeated sampling. This paper therefore offers an intuitive perspective on the literature, and we believe it is a valid contribution towards understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
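The inflation of false-positive rates under p value-based optional stopping can be illustrated with a small simulation. The following is a minimal sketch, not the simulation design used by Rouder (2014) or by this paper: it assumes a two-sided z test on normal data with known unit variance, and the function names, sample sizes, and number of replications are all illustrative choices.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_max, n_min=10, alpha=0.05, n_sims=2000,
                        optional_stopping=True, seed=1):
    """Estimate how often H0: mu = 0 is rejected when H0 is true.

    With optional_stopping=True the test is run after every new
    observation from n_min up to n_max, and sampling stops at the
    first rejection. Otherwise a single test is run at n_max.
    All defaults are assumptions made for illustration.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # data generated under H0
            test_now = (n >= n_min) if optional_stopping else (n == n_max)
            # z statistic for known unit variance: sum / sqrt(n)
            if test_now and two_sided_p(total / math.sqrt(n)) < alpha:
                rejections += 1
                break
    return rejections / n_sims


if __name__ == "__main__":
    print("single test at n = 100:", false_positive_rate(100, optional_stopping=False))
    print("testing after each observation:", false_positive_rate(100, optional_stopping=True))
```

Under these assumptions, the single fixed-sample test should reject at a rate close to the nominal 0.05, while testing after every observation should reject noticeably more often, which is the problem the abstract refers to.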