Bartroff Jay, Song Jinlin
Department of Mathematics, University of Southern California, Los Angeles, California, USA.
Analysis Group, Inc., Los Angeles, California, USA.
Seq Anal. 2020;39(1):65-91. doi: 10.1080/07474946.2020.1726686. Epub 2020 May 13.
We propose a general and flexible procedure for testing multiple hypotheses about sequential (or streaming) data that simultaneously controls both the false discovery rate (FDR) and the false nondiscovery rate (FNR) under minimal assumptions about the data streams, which may differ in distribution and dimension and may be dependent. All that is needed is a test statistic for each data stream that controls its conventional type I and type II error probabilities; no information or assumptions are required about the joint distribution of the statistics or data streams. The procedure can be used with sequential, group sequential, truncated, or other sampling schemes. It is a natural extension of Benjamini and Hochberg's (1995) widely used fixed-sample-size procedure to the domain of sequential data, with the added benefit of the simultaneous FDR and FNR control that sequential sampling affords. We prove the procedure's error control and give some tips for implementation in commonly encountered testing situations.
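For orientation, the fixed-sample-size Benjamini and Hochberg (1995) step-up procedure that the abstract takes as its starting point can be sketched as follows. This is only a minimal illustration of that classical procedure, not the sequential procedure proposed in the paper; the function name `benjamini_hochberg` and the parameters `p_values` and `alpha` are hypothetical names chosen for this sketch.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Sketch of the classical fixed-sample BH step-up procedure.

    Returns the sorted indices of the rejected hypotheses.
    Assumption: p_values holds one p-value per hypothesis.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, provided a larger ordered p-value falls below its threshold. The sequential setting described above replaces fixed-sample p-values with per-stream sequential test statistics, which this sketch does not attempt to model.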