Department of Psychology, Bowling Green State University, Bowling Green, OH, 43402, USA.
Behav Res Methods. 2022 Jun;54(3):1131-1147. doi: 10.3758/s13428-021-01618-1. Epub 2021 Sep 7.
Prior work by Michael R. Dougherty and colleagues (Yu et al., 2014) shows that when a scientist monitors the p value during data collection and uses a critical p as the signal to stop collecting data, the resulting p is distorted due to Type I error-rate inflation. They argued similarly that using a critical Bayes factor (BF) as a stopping signal distorts the obtained BF, a position that has met with controversy. The present paper clarifies that when a critical BF is used as a stopping criterion, the sample becomes biased: data consistent with large effects have a greater chance of being included than other data, which biases the input to Bayesian inference. We report simulations of yoked pairs of scientists in which Scientist A uses BF to optionally stop, while Scientist B, sampling from the same population, stops when A stops. Thus, optional stopping is compared not to a hypothetical in which no stopping occurs, but to a situation in which B stops for reasons unrelated to the characteristics of B's sample. The results indicate that optional stopping biases the input for Bayesian inference. We also simulated the use of effect-size stabilization as a stopping criterion and found no bias in that case.
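The yoked-pair design described above can be illustrated with a minimal sketch. It is not the authors' simulation code. Everything specific here is an assumption made for illustration: observations are drawn from a normal population with known unit variance, the Bayes factor compares H0: mu = 0 against H1: mu ~ N(0, tau^2) (a simple closed-form stand-in for whatever BF the paper used), and the function names `bf10` and `simulate_yoked_pairs`, the critical BF of 3, the true effect of 0.2, and the sample-size limits are all hypothetical choices.

```python
import math
import random
from statistics import fmean


def bf10(mean, n, tau=1.0):
    """Bayes factor for H1: mu ~ N(0, tau^2) versus H0: mu = 0.

    Assumes data ~ N(mu, 1) with known unit variance (an illustrative
    simplification, not the model used in the paper).
    """
    shrink = n * tau**2 / (1.0 + n * tau**2)
    z_squared = n * mean**2
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z_squared * shrink)


def simulate_yoked_pairs(n_pairs=2000, delta=0.2, bf_crit=3.0, tau=1.0,
                         n_min=10, n_max=200, seed=1):
    """Simulate yoked pairs: A stops on BF, B stops whenever A stops."""
    rng = random.Random(seed)
    a_means, b_means, stopped_for_h1 = [], [], []
    for _ in range(n_pairs):
        # Scientist A adds one observation at a time and checks BF.
        total_a, n = 0.0, 0
        while True:
            total_a += rng.gauss(delta, 1.0)
            n += 1
            if n < n_min:
                continue
            bf = bf10(total_a / n, n, tau)
            if bf >= bf_crit or bf <= 1.0 / bf_crit or n >= n_max:
                break
        # Scientist B samples the same population independently and
        # stops at A's n, so B's stopping is unrelated to B's own data.
        total_b = sum(rng.gauss(delta, 1.0) for _ in range(n))
        a_means.append(total_a / n)
        b_means.append(total_b / n)
        stopped_for_h1.append(bf >= bf_crit)

    # Restrict to pairs in which A stopped on evidence for H1.
    a_h1 = [m for m, h in zip(a_means, stopped_for_h1) if h]
    b_h1 = [m for m, h in zip(b_means, stopped_for_h1) if h]
    return {
        "mean_a": fmean(a_means),
        "mean_b": fmean(b_means),
        "n_h1": len(a_h1),
        "mean_a_h1": fmean(a_h1) if a_h1 else float("nan"),
        "mean_b_h1": fmean(b_h1) if b_h1 else float("nan"),
    }


if __name__ == "__main__":
    for key, value in simulate_yoked_pairs().items():
        print(f"{key}: {value:.3f}")
```

Under these assumptions, comparing `mean_a_h1` with `mean_b_h1` shows how A's effect estimates, in the runs where A stopped on evidence for H1, relate to B's estimates from the same runs at the same sample sizes. B's overall mean should stay close to the true effect because B's stopping point carries no information about B's data.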