Scott James G, Kelly Ryan C, Smith Matthew A, Zhou Pengcheng, Kass Robert E
University of Texas, Austin, USA.
Google, New York, USA.
J Am Stat Assoc. 2015;110(510):459-471. doi: 10.1080/01621459.2014.990973.
Many approaches for multiple testing begin with the assumption that all tests in a given study should be combined into a global false-discovery-rate analysis. But this may be inappropriate for many of today's large-scale screening problems, where auxiliary information about each test is often available, and where a combined analysis can lead to poorly calibrated error rates within different subsets of the experiment. To address this issue, we introduce an approach called false-discovery-rate regression that directly uses this auxiliary information to inform the outcome of each test. The method can be motivated by a two-groups model in which covariates are allowed to influence the local false discovery rate, or equivalently, the posterior probability that a given observation is a signal. This formulation poses many subtle issues at the interface between inference and computation, and we investigate several variations of the overall approach. Simulation evidence suggests that: (1) when covariate effects are present, FDR regression improves power for a fixed false-discovery rate; and (2) when covariate effects are absent, the method is robust, in the sense that it does not lead to inflated error rates. We apply the method to neural recordings from primary visual cortex. The goal is to detect pairs of neurons that exhibit fine-time-scale interactions, in the sense that they fire together more often than expected due to chance. Our method detects roughly 50% more synchronous pairs than a standard FDR-controlling analysis. The companion R package FDRreg implements all methods described in the paper.
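The two-groups idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimation procedure and does not use the FDRreg package. Everything specific here is an assumption made for illustration: the logistic link for the covariate-dependent prior, the fixed coefficients `intercept` and `slope`, the N(0, 1) null density, the N(0, alt_sd^2) alternative density, and all function names. In practice these quantities would be estimated from data rather than fixed by hand.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def signal_prior(x, intercept, slope):
    """Prior probability that a test is a signal, given covariate x.

    Assumption: a logistic link with hand-picked coefficients.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def posterior_signal_prob(z, x, intercept=-2.0, slope=1.5, alt_sd=3.0):
    """Posterior probability that test statistic z is a signal (two-groups model)."""
    c = signal_prior(x, intercept, slope)
    f0 = normal_pdf(z)             # assumed null density: N(0, 1)
    f1 = normal_pdf(z, sd=alt_sd)  # assumed alternative density: N(0, alt_sd^2)
    return c * f1 / (c * f1 + (1.0 - c) * f0)


def local_fdr(z, x, **kwargs):
    """Local false discovery rate: one minus the posterior signal probability."""
    return 1.0 - posterior_signal_prob(z, x, **kwargs)


def select_discoveries(lfdrs, target_fdr=0.1):
    """Return indices of the largest set of tests, taken in increasing order of
    local fdr, whose average local fdr stays at or below target_fdr."""
    order = sorted(range(len(lfdrs)), key=lambda i: lfdrs[i])
    selected, running_sum = [], 0.0
    for k, i in enumerate(order, start=1):
        running_sum += lfdrs[i]
        if running_sum / k > target_fdr:
            break
        selected.append(i)
    return selected
```

Under these assumptions, the same test statistic receives a smaller local false discovery rate when its covariate makes a signal more plausible a priori, which is the mechanism by which covariate information can raise power at a fixed false-discovery rate. Setting `slope=0.0` removes the covariate effect, so every test shares one prior, as in a single global two-groups analysis.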