A weighted U-statistic for genetic association analyses of sequencing data.

Authors

Wei Changshuai, Li Ming, He Zihuai, Vsevolozhskaya Olga, Schaid Daniel J, Lu Qing

Affiliations

Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, United States of America; Department of Biostatistics and Epidemiology, University of North Texas Health Science Center, Fort Worth, Texas, United States of America.

Publication information

Genet Epidemiol. 2014 Dec;38(8):699-708. doi: 10.1002/gepi.21864. Epub 2014 Oct 20.

Abstract

With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, high-dimensional sequencing data pose a great challenge for statistical analysis. Association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption about the underlying disease model or phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very-low-density lipoprotein cholesterol.
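To make the idea of a weighted U-statistic concrete, the following is a minimal sketch of a generic weighted pairwise statistic with a permutation p-value. It is not the authors' WU-SEQ implementation: the abstract does not give the kernel, the weight function, or the null distribution, so the rank-based normal scores, the similarity weight in `genotype_weight`, the per-variant weights, and the permutation test are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def rank_normal_scores(y):
    """Replace each phenotype by the normal quantile of its rank.

    Using ranks avoids any assumption about the phenotype distribution
    (assumption of this sketch; ties are broken by input order).
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = NormalDist().inv_cdf((rank - 0.5) / n)
    return scores


def genotype_weight(g_i, g_j, variant_weights):
    """Hypothetical pairwise weight in [0, 1] for genotypes coded 0/1/2.

    Each variant contributes 2 - |difference|, scaled by its weight.
    """
    shared = sum(w * (2 - abs(a - b))
                 for w, a, b in zip(variant_weights, g_i, g_j))
    return shared / (2 * sum(variant_weights))


def weighted_u(scores, weights):
    """Average of weight[i][j] * score[i] * score[j] over ordered pairs i != j."""
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += weights[i][j] * scores[i] * scores[j]
    return total / (n * (n - 1))


def permutation_test(y, genotypes, variant_weights, n_perm=999, seed=0):
    """Return (observed statistic, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(y)
    weights = [[genotype_weight(genotypes[i], genotypes[j], variant_weights)
                for j in range(n)] for i in range(n)]
    scores = rank_normal_scores(y)
    observed = weighted_u(scores, weights)
    exceed = 0
    for _ in range(n_perm):
        permuted = scores[:]
        rng.shuffle(permuted)  # break any phenotype-genotype link
        if weighted_u(permuted, weights) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

In this sketch a large statistic means that subjects with similar genotypes also tend to have similar phenotype ranks, which is why the p-value is one-sided. The double loop is quadratic in the number of subjects, so it suits only small illustrative data sets.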
