Covariate adaptive familywise error rate control for genome-wide association studies.

Authors

Zhou Huijuan, Zhang Xianyang, Chen Jun

Affiliations

Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China.

Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A.

Publication information

Biometrika. 2020 Nov 27;108(4):915-931. doi: 10.1093/biomet/asaa098. eCollection 2021 Dec.

Abstract

The familywise error rate has been widely used in genome-wide association studies. With the increasing availability of functional genomics data, it is possible to increase detection power by leveraging these genomic functional annotations. Previous efforts to accommodate covariates in multiple testing focused on false discovery rate control, while covariate-adaptive procedures controlling the familywise error rate remain underdeveloped. Here, we propose a novel covariate-adaptive procedure to control the familywise error rate that incorporates external covariates which are potentially informative of either the statistical power or the prior null probability. An efficient algorithm is developed to implement the proposed method. We prove its asymptotic validity and obtain the rate of convergence through a perturbation-type argument. Our numerical studies show that the new procedure is more powerful than competing methods and maintains robustness across different settings. We apply the proposed approach to the UK Biobank data and analyse 27 traits with 9 million single-nucleotide polymorphisms tested for associations. Seventy-five genomic annotations are used as covariates. Our approach detects more genome-wide significant loci than other methods in 21 out of the 27 traits.
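To illustrate how covariate-derived weights can shift per-hypothesis significance thresholds while keeping the familywise error rate at a fixed level, here is a minimal sketch of the classical weighted Bonferroni rule. This is not the procedure proposed in the paper, which estimates its weighting adaptively from the data. The function name `weighted_bonferroni` and the example p-values and weights are hypothetical, and the weights are assumed to be fixed in advance from external annotations, independently of the p-values.

```python
import numpy as np


def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Weighted Bonferroni: reject H_i when p_i <= alpha * w_i / m.

    Assumes the weights are non-negative and chosen without looking at the
    p-values (for example, from prior functional annotations). They are
    rescaled to average 1, so the thresholds sum to alpha.
    """
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.shape != w.shape:
        raise ValueError("pvalues and weights must have the same shape")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    m = p.size
    w = w * m / w.sum()          # rescale so the mean weight is 1
    return p <= alpha * w / m    # boolean rejection indicator per hypothesis


# Hypothetical toy data: four tests, the second one up-weighted.
pvals = [0.001, 0.02, 0.012, 0.5]
print(weighted_bonferroni(pvals, [1, 2, 0.5, 0.5]))  # weighted thresholds
print(weighted_bonferroni(pvals, [1, 1, 1, 1]))      # plain Bonferroni
```

With equal weights the rule reduces to ordinary Bonferroni. Up-weighting one hypothesis relaxes its threshold at the cost of stricter thresholds elsewhere, which is the trade-off that informative covariates are meant to exploit.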
