False Discovery Rate Control With Groups.

Authors

Hu James X, Zhao Hongyu, Zhou Harrison H

Affiliation

Department of Statistics, Yale University, New Haven, CT 06511.

Publication information

J Am Stat Assoc. 2010 Sep 1;105(491):1215-1227. doi: 10.1198/jasa.2010.tm09329.

DOI:10.1198/jasa.2010.tm09329

PMID:21931466

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3175141/

Abstract

In the context of large-scale multiple hypothesis testing, the hypotheses often possess certain group structures based on additional information such as Gene Ontology in gene expression data and phenotypes in genome-wide association studies. It is hence desirable to incorporate such information when dealing with multiplicity problems to increase statistical power. In this article, we demonstrate the benefit of considering group structure by presenting a p-value weighting procedure which utilizes the relative importance of each group while controlling the false discovery rate under weak conditions. The procedure is easy to implement and shown to be more powerful than the classical Benjamini-Hochberg procedure in both theoretical and simulation studies. By estimating the proportion of true null hypotheses, the data-driven procedure controls the false discovery rate asymptotically. Our analysis on one breast cancer dataset confirms that the procedure performs favorably compared with the classical method.
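The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one way a group-weighted Benjamini-Hochberg step could look. It assumes the per-group proportions of true nulls are already known or estimated, and that each p-value is scaled by the ratio of null to non-null proportion in its group before a BH step-up at an adjusted level. The function name `group_bh` and the argument `pi0_by_group` are hypothetical names chosen for illustration and are not taken from the article.

```python
import math


def group_bh(pvalues, groups, pi0_by_group, alpha=0.05):
    """Sketch of a group-weighted BH step-up (assumed form, not the authors' code).

    pvalues      -- list of p-values
    groups       -- list of group labels, one per p-value
    pi0_by_group -- dict: group label -> assumed proportion of true nulls
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    n = len(pvalues)
    if n == 0:
        return []

    # Overall null proportion: average of the group proportions over all tests.
    pi0 = sum(pi0_by_group[g] for g in groups) / n
    if pi0 >= 1.0:
        return [False] * n  # everything assumed null: reject nothing

    # Weight each p-value by pi_g0 / (1 - pi_g0) for its group.
    weighted = []
    for p, g in zip(pvalues, groups):
        pi_g0 = pi0_by_group[g]
        if pi_g0 >= 1.0:
            weighted.append(math.inf)  # a group of pure nulls is never rejected
        else:
            weighted.append(p * pi_g0 / (1.0 - pi_g0))

    # BH step-up on the weighted p-values at the adjusted level alpha / (1 - pi0).
    alpha_w = alpha / (1.0 - pi0)
    order = sorted(range(n), key=lambda i: weighted[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if weighted[idx] <= rank * alpha_w / n:
            k = rank

    rejected = [False] * n
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Illustrative inputs only (made-up numbers, not data from the article).
example = group_bh(
    [0.001, 0.02, 0.04, 0.5, 0.03, 0.6],
    ["a", "a", "a", "b", "b", "b"],
    {"a": 0.5, "b": 0.9},
    alpha=0.05,
)
```

In this sketch, groups believed to contain more true signals get smaller weights, so their p-values are more likely to clear the step-up threshold. In a data-driven version the proportions in `pi0_by_group` would be estimated from the p-values rather than supplied.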


Similar articles

Comparison of methods for estimating the number of true null hypotheses in multiplicity testing.

J Biopharm Stat. 2003 Nov;13(4):675-89. doi: 10.1081/BIP-120024202.

Multiple testing with discrete data: Proportion of true null hypotheses and two adaptive FDR procedures.

Biom J. 2018 Jul;60(4):761-779. doi: 10.1002/bimj.201700157. Epub 2018 May 11.

Estimating the proportion of true null hypotheses and adaptive false discovery rate control in discrete paradigm.

Biom J. 2024 Mar;66(2):e2200204. doi: 10.1002/bimj.202200204.

Multiple testing in genome-wide association studies via hidden Markov models.

Bioinformatics. 2009 Nov 1;25(21):2802-8. doi: 10.1093/bioinformatics/btp476. Epub 2009 Aug 4.

On the operational characteristics of the Benjamini and Hochberg False Discovery Rate procedure.

Stat Appl Genet Mol Biol. 2007;6:Article27. doi: 10.2202/1544-6115.1302. Epub 2007 Oct 11.

POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES.

Ann Stat. 2011 Feb;39(1):556-583. doi: 10.1214/10-aos844.

Expected Power for the False Discovery Rate with Independence.

Commun Stat Theory Methods. 2008 Jan;37(12):1855-1866. doi: 10.1080/03610920801893731.

Bias and variance reduction in estimating the proportion of true-null hypotheses.

Biostatistics. 2015 Jan;16(1):189-204. doi: 10.1093/biostatistics/kxu029. Epub 2014 Jun 23.

Resampling-based empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value error rates: focus on the false discovery rate and simulation study.

Biom J. 2008 Oct;50(5):716-44. doi: 10.1002/bimj.200710473.

Cited by

Transfer Learning in Genome-Wide Association Studies with Knockoffs.

Sankhya B (2008). 2022 Nov 15. doi: 10.1007/s13571-022-00297-y.

Controlling the False Split Rate in Tree-Based Aggregation.

J Am Stat Assoc. 2025;120(550):935-947. doi: 10.1080/01621459.2024.2376285. Epub 2024 Sep 24.

Global Analysis of Nutritional Factors and Cardiovascular Risk: Insights from Worldwide Data and a Case Study in Mexican Children.

J Cardiovasc Dev Dis. 2025 Mar 25;12(4):115. doi: 10.3390/jcdd12040115.

Statistical methods leveraging the hierarchical structure of adverse events for signal detection in clinical trials: a scoping review of the methodological literature.

BMC Med Res Methodol. 2024 Oct 28;24(1):253. doi: 10.1186/s12874-024-02369-1.

Structural analysis of genomic and proteomic signatures reveal dynamic expression of intrinsically disordered regions in breast cancer.

iScience. 2024 Aug 7;27(9):110640. doi: 10.1016/j.isci.2024.110640. eCollection 2024 Sep 20.

False Discovery Rate Control for Lesion-Symptom Mapping With Heterogeneous Data via Weighted p-Values.

Biom J. 2024 Sep;66(6):e202300198. doi: 10.1002/bimj.202300198.

Control of false discoveries in grouped hypothesis testing for eQTL data.

BMC Bioinformatics. 2024 Apr 11;25(1):147. doi: 10.1186/s12859-024-05736-3.

Bioinformatics Methods for Transcriptome Analysis on Teratogenesis Testing.

Methods Mol Biol. 2024;2753:365-376. doi: 10.1007/978-1-0716-3625-1_20.

Association of Variants in Innate Immune Genes and with Reproductive and Milk Production Traits in Czech Simmental Cattle.

Genes (Basel). 2023 Dec 23;15(1):24. doi: 10.3390/genes15010024.

2dGBH: Two-dimensional group Benjamini-Hochberg procedure for false discovery rate control in two-way multiple testing of genomic data.

Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae035.

References

Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies.

Genet Epidemiol. 2006 Sep;30(6):519-30. doi: 10.1002/gepi.20164.

Using linkage genome scans to improve power of association in genome scans.

Am J Hum Genet. 2006 Feb;78(2):243-52. doi: 10.1086/500026. Epub 2006 Jan 3.

Comparison of methods for estimating the number of true null hypotheses in multiplicity testing.

J Biopharm Stat. 2003 Nov;13(4):675-89. doi: 10.1081/BIP-120024202.

Gatekeeping strategies for clinical trials that do not require all primary effects to be significant.

Stat Med. 2003 Aug 15;22(15):2387-400. doi: 10.1002/sim.1526.

Empirical bayes methods and false discovery rates for microarrays.

Genet Epidemiol. 2002 Jun;23(1):70-86. doi: 10.1002/gepi.1124.

Gene expression profiling predicts clinical outcome of breast cancer.

Nature. 2002 Jan 31;415(6871):530-6. doi: 10.1038/415530a.

Computational analysis of microarray data.

Nat Rev Genet. 2001 Jun;2(6):418-27. doi: 10.1038/35076576.

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Nat Genet. 2000 May;25(1):25-9. doi: 10.1038/75556.
