Estimating False Discovery Proportion Under Arbitrary Covariance Dependence.

Author information

Fan Jianqing, Han Xu, Gu Weijie

Affiliations

Department of Operations Research & Financial Engineering, Princeton University, Princeton, NJ 08544, USA and honorary professor, School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China.

Department of Statistics, University of Florida, Gainesville, FL 32606.

Publication information

J Am Stat Assoc. 2012;107(499):1019-1035. doi: 10.1080/01621459.2012.720478.

Abstract

Multiple hypothesis testing is a fundamental problem in high-dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to determine whether any SNPs are associated with some traits, and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In the current paper, we propose a novel method to deal with an arbitrary dependence structure, based on principal factor approximation, which subtracts the common dependence and significantly weakens the correlation structure. We derive an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of the realized FDP. This result has important applications in controlling the false discovery rate (FDR) and FDP. Our estimate of the realized FDP compares favorably with the approach of Efron (2007), as demonstrated in the simulated examples. Our approach is further illustrated by some real data applications. We also propose a dependence-adjusted procedure, which is more powerful than the fixed-threshold procedure.
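
The abstract gives no formulas, so the Python sketch below is only an illustration of how a principal-factor-based estimate of the realized FDP might be computed for two-sided z-tests with a known correlation matrix. The function name `pfa_fdp_estimate`, the least-squares fit of the common factors on the smallest 90% of |z| values, and the user-supplied number of factors `k` are assumptions made for this example and are not stated in the abstract.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _phi(x):
    """Standard normal CDF applied elementwise to a 1-D array."""
    return np.array([_STD_NORMAL.cdf(float(v)) for v in np.ravel(x)])


def pfa_fdp_estimate(z, sigma, t, k, null_fraction=0.9):
    """Illustrative FDP estimate at p-value threshold t (hypothetical helper).

    z      : observed z-statistics, shape (p,)
    sigma  : known correlation matrix of z, shape (p, p)
    t      : common p-value threshold, 0 < t < 1
    k      : number of principal factors to remove (k >= 1)
    """
    z = np.asarray(z, dtype=float)
    p = z.size

    # Principal factors: top-k eigenpairs of the correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    top = np.argsort(eigvals)[::-1][:k]
    loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

    # Assumption: fit the common factors by least squares using the
    # statistics with the smallest |z|, which are most likely true nulls.
    m = max(k, int(null_fraction * p))
    likely_null = np.argsort(np.abs(z))[:m]
    w_hat, *_ = np.linalg.lstsq(loadings[likely_null], z[likely_null], rcond=None)

    # Common-factor shift and scale of the remaining weakly dependent part.
    eta = loadings @ w_hat
    resid_var = 1.0 - np.sum(loadings ** 2, axis=1)
    a = 1.0 / np.sqrt(np.clip(resid_var, 1e-12, None))

    # Approximate number of false rejections given the fitted factors.
    z_half = _STD_NORMAL.inv_cdf(t / 2.0)  # lower t/2 quantile (negative)
    expected_false = np.sum(_phi(a * (z_half + eta)) + _phi(a * (z_half - eta)))

    # Observed number of rejections for two-sided p-values.
    pvals = 2.0 * _phi(-np.abs(z))
    rejections = int(np.sum(pvals <= t))

    fdp = min(1.0, float(expected_false) / rejections) if rejections > 0 else 0.0
    return fdp, rejections


if __name__ == "__main__":
    # Toy example: equicorrelated statistics with a handful of true signals.
    rng = np.random.default_rng(0)
    p, rho = 200, 0.5
    sigma = np.full((p, p), rho)
    np.fill_diagonal(sigma, 1.0)
    z = rng.multivariate_normal(np.zeros(p), sigma)
    z[:10] += 4.0
    print(pfa_fdp_estimate(z, sigma, t=0.01, k=1))
```

The sum runs over all p hypotheses rather than only the unknown true nulls, which makes the sketch conservative when few signals are present. In practice the correlation matrix and the number of factors would have to be chosen or estimated from data, which this example does not attempt.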
