Gablenz Paula, Sabatti Chiara
Department of Statistics, Stanford University, 390 Jane Stanford Way, Stanford, CA 94305-4020 USA.
Department of Biomedical Data Science, Stanford University, Medical School Office Building 1265 Welch Road MC5464, Stanford, CA 94305-5464, USA.
J R Stat Soc Series B Stat Methodol. 2024 Jun 14;87(1):56-73. doi: 10.1093/jrsssb/qkae042. eCollection 2025 Feb.
We consider problems where many, somewhat redundant, hypotheses are tested and we are interested in reporting the most precise rejections, with false discovery rate (FDR) control. This is the case, for example, when researchers are interested both in individual hypotheses and in group hypotheses corresponding to intersections of sets of the original hypotheses, at several resolution levels. A concrete application is in genome-wide association studies, where, depending on the signal strengths, it might be possible to resolve the influence of individual genetic variants on a phenotype with greater or lower precision. To adapt to the unknown signal strength, analyses are conducted at multiple resolutions and researchers are most interested in the more precise discoveries. Assuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage e-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analysing data from the UK Biobank.
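For readers unfamiliar with how e-values can yield FDR control, the following is a minimal sketch of the standard e-BH procedure at a single resolution. It is not the multi-resolution linear-programming procedure described in the abstract. The function name `e_bh` and the example e-values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def e_bh(e_values: Sequence[float], alpha: float = 0.1) -> List[int]:
    """Return the indices rejected by the e-BH procedure at level alpha.

    Illustrative sketch only: single resolution, no groups, no linear program.
    """
    m = len(e_values)
    # Order hypothesis indices by e-value, largest first.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # The k-th largest e-value must reach the threshold m / (alpha * k).
        if e_values[idx] >= m / (alpha * k):
            k_star = k
    # Reject the hypotheses with the k_star largest e-values.
    return sorted(order[:k_star])


# Hypothetical e-values for five hypotheses.
rejected = e_bh([50.0, 0.5, 30.0, 1.2, 8.0], alpha=0.2)
print(rejected)
```

With these made-up inputs the thresholds for k = 1, 2, 3 are 25, 12.5 and about 8.33, so only the two largest e-values pass and hypotheses 0 and 2 are rejected.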