Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands.
Department of Psychology, Ludwig-Maximilians-Universität München, München, Germany.
Behav Res Methods. 2022 Dec;54(6):3100-3117. doi: 10.3758/s13428-021-01754-8. Epub 2022 Mar 1.
In a sequential hypothesis test, the analyst checks at multiple steps during data collection whether sufficient evidence has accrued to make a decision about the tested hypotheses. As soon as sufficient information has been obtained, data collection is terminated. Here, we compare two sequential hypothesis testing procedures that have recently been proposed for use in psychological research: the Sequential Probability Ratio Test (SPRT; Psychological Methods, 25(2), 206-226, 2020) and the Sequential Bayes Factor Test (SBFT; Psychological Methods, 22(2), 322-339, 2017). We show that although the two methods have different philosophical roots, they share many similarities and can even be regarded mathematically as two instances of an overarching hypothesis testing framework. We demonstrate that the two methods use the same mechanisms for evidence monitoring and error control, and that differences in efficiency between the methods depend on the exact specification of the statistical models involved, as well as on the population truth. Our simulations indicate that when deciding on a sequential design within a unified sequential testing framework, researchers need to balance test efficiency, robustness against model misspecification, and appropriate uncertainty quantification. We provide guidance for navigating these design decisions based on individual preferences and simulation-based design analyses.
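The monitoring mechanism described above (accumulate evidence after each observation and stop once it crosses a threshold) can be illustrated with a minimal sketch. It is not the procedure from either cited paper: it assumes the simplest textbook case of Wald's SPRT, with normally distributed data, a known standard deviation, and two point hypotheses. The function name `sprt_normal` and all parameter defaults are hypothetical choices made for illustration. The stopping bounds use Wald's approximations, A = (1 - beta) / alpha and B = beta / (1 - alpha).

```python
import math
import random


def sprt_normal(data, mu0=0.0, mu1=0.5, sigma=1.0, alpha=0.05, beta=0.05):
    """Illustrative Wald SPRT for H0: mu = mu0 vs. H1: mu = mu1.

    Assumes i.i.d. normal data with known sigma (a simplification made
    for this sketch). Returns (decision, n_used, log_likelihood_ratio).
    """
    # Wald's approximate stopping bounds on the log scale.
    upper = math.log((1 - beta) / alpha)   # cross this -> accept H1
    lower = math.log(beta / (1 - alpha))   # cross this -> accept H0

    llr = 0.0  # cumulative log-likelihood ratio of H1 over H0
    for n, x in enumerate(data, start=1):
        # Log-likelihood ratio contribution of one observation.
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
        if llr >= upper:
            return "accept H1", n, llr
        if llr <= lower:
            return "accept H0", n, llr

    # Data ran out before either bound was crossed.
    return "undecided", len(data), llr


if __name__ == "__main__":
    # Usage example: simulate data under H1 and monitor sequentially.
    rng = random.Random(1)
    sample = [rng.gauss(0.5, 1.0) for _ in range(500)]
    print(sprt_normal(sample))
```

With two point hypotheses, the likelihood ratio monitored here equals the Bayes factor between those hypotheses, so the same loop can be read as a sequential Bayes factor test with point priors. Replacing the point hypothesis under H1 with a prior distribution changes how the evidence is computed at each step, but not the stopping logic.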