Loper J H, Lei L, Fithian W, Tansey W
Department of Neuroscience, Columbia University, 716 Jerome L. Greene Building, New York, New York 10025, U.S.A.
Department of Statistics, Stanford University, Sequoia Hall, Palo Alto, California 94305, U.S.A.
Biometrika. 2022 Jun;109(2):457-471. doi: 10.1093/biomet/asab041. Epub 2021 Jul 2.
We consider the problem of multiple hypothesis testing when there is a logical nested structure to the hypotheses. When one hypothesis is nested inside another, the outer hypothesis must be false if the inner hypothesis is false. We model the nested structure as a directed acyclic graph, including chain and tree graphs as special cases. Each node in the graph is a hypothesis, and rejecting a node requires also rejecting all of its ancestors. We propose a general framework for adjusting node-level test statistics using the known logical constraints. Within this framework, we study a smoothing procedure that combines each node with all of its descendants to form a more powerful statistic. We prove that a broad class of smoothing strategies can be used with existing selection procedures to control the familywise error rate, false discovery exceedance rate, or false discovery rate, so long as the original test statistics are independent under the null. When the null statistics are not independent but are derived from positively correlated normal observations, we prove control for all three error rates when the smoothing method is arithmetic averaging of the observations. Simulations and an application to a real biology dataset demonstrate that smoothing leads to substantial power gains.
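The smoothing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each node carries a z-score that is independent standard normal under its null, and it uses a Stouffer-style combination (sum divided by the square root of the group size) as one possible smoothing rule. The graph, the z-scores, and the function names `descendants` and `smooth_p_values` are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def descendants(children, node):
    """Return every descendant of `node` in the DAG (excluding the node itself)."""
    seen = set()
    stack = list(children.get(node, []))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children.get(cur, []))
    return seen


def smooth_p_values(children, z):
    """Combine each node's z-score with those of all its descendants.

    Assumption (not from the source): null z-scores are independent N(0, 1),
    so the scaled sum over a group of k nodes is again N(0, 1) when the node,
    and therefore every descendant, is null.
    """
    norm = NormalDist()
    smoothed = {}
    for node in z:
        group = {node} | descendants(children, node)
        combined = sum(z[v] for v in group) / math.sqrt(len(group))
        # One-sided p-value: large positive z-scores are evidence against the null.
        smoothed[node] = 1.0 - norm.cdf(combined)
    return smoothed


# Hypothetical chain graph a -> b -> c, where "a" is the outermost hypothesis.
children = {"a": ["b"], "b": ["c"], "c": []}
z = {"a": 1.0, "b": 2.0, "c": 3.0}
p = smooth_p_values(children, z)
```

In this toy chain, the leaf "c" keeps its own p-value, while "a" borrows strength from "b" and "c" and ends up with a smaller p-value than its raw z-score alone would give. The smoothed p-values would then be handed to a selection procedure that respects the ancestor constraint, a step this sketch does not cover.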