Human Genetics Center, University of Texas School of Public Health, Houston, TX 77030, USA.
Genome Res. 2011 Jul;21(7):1099-108. doi: 10.1101/gr.115998.110. Epub 2011 Apr 26.
Genome-wide association studies (GWAS) have become the primary approach for identifying genes whose common variants influence complex diseases. Despite considerable progress, the common variants identified by GWAS account for only a small fraction of disease heritability and are unlikely to explain the majority of phenotypic variation in common diseases. A potential source of this missing heritability is the contribution of rare variants. Next-generation sequencing technologies will detect millions of novel rare variants, but the resulting data have three defining features: a large number of rare variants, a high proportion of sequence errors, and a large proportion of missing data. These features raise challenges for testing the association of rare variants with phenotypes of interest. In this study, we use a genome continuum model and functional principal components as a general principle for developing novel and powerful association analysis methods designed for resequencing data. We use simulations to calculate the type I error rates and power of nine alternative statistics: two functional principal component analysis (FPCA)-based statistics, the multivariate principal component analysis (MPCA)-based statistic, the weighted sum (WSS), the variable-threshold (VT) method, the generalized T², the collapsing method, the CMC method, and individual variant tests. We also examine the impact of sequence errors on their type I error rates. Finally, we apply the nine statistics to the published resequencing data set for ANGPTL4 from the Dallas Heart Study. We find that the FPCA-based statistics have higher power to detect associations of rare variants and a stronger ability to filter out sequence errors than the other seven methods.
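The abstract names several competing rare-variant tests. As a rough illustration only, and not the authors' implementation, the sketch below shows two simplified versions in Python: a CAST-style collapsing test, and a Hotelling T²-type test on principal component scores of smoothed genotype profiles, loosely in the spirit of an FPCA-based statistic. The function names, the Fourier-basis smoothing, the choices of n_basis and n_components, and the binary case/control coding are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy import stats

def collapsing_test(genotypes, phenotype):
    """CAST-style collapsing test (illustrative sketch, not the paper's code).

    genotypes: (n_individuals, n_variants) array of rare-allele counts (0/1/2).
    phenotype: (n_individuals,) array of 0 (control) / 1 (case).
    Collapses all rare variants into a single carrier indicator and compares
    carrier frequency between cases and controls with Fisher's exact test.
    """
    carrier = (genotypes.sum(axis=1) > 0).astype(int)
    table = [[np.sum((carrier == 1) & (phenotype == 1)),
              np.sum((carrier == 0) & (phenotype == 1))],
             [np.sum((carrier == 1) & (phenotype == 0)),
              np.sum((carrier == 0) & (phenotype == 0))]]
    _, p = stats.fisher_exact(table)
    return p

def pc_t2_test(genotypes, phenotype, positions, n_basis=5, n_components=3):
    """Hotelling T² test on PC scores of smoothed genotype profiles.

    A crude stand-in for an FPCA-based statistic: each individual's genotype
    profile is expanded in a Fourier basis over normalized genomic position,
    the basis coefficients are reduced by PCA, and case/control mean scores
    are compared with Hotelling's T² (converted to an F statistic).
    """
    t = (positions - positions.min()) / (positions.max() - positions.min() + 1e-12)
    basis = [np.ones_like(t)]
    k = 1
    while len(basis) < n_basis:
        basis.append(np.sin(2 * np.pi * k * t))
        basis.append(np.cos(2 * np.pi * k * t))
        k += 1
    B = np.column_stack(basis[:n_basis])                  # (n_variants, n_basis)
    coefs, *_ = np.linalg.lstsq(B, genotypes.T, rcond=None)
    coefs = coefs.T                                       # (n_individuals, n_basis)
    coefs -= coefs.mean(axis=0)
    _, _, Vt = np.linalg.svd(coefs, full_matrices=False)
    scores = coefs @ Vt[:n_components].T                  # PC scores per individual
    x, y = scores[phenotype == 1], scores[phenotype == 0]
    n1, n0 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    Sp = ((n1 - 1) * np.cov(x, rowvar=False) +
          (n0 - 1) * np.cov(y, rowvar=False)) / (n1 + n0 - 2)
    T2 = (n1 * n0 / (n1 + n0)) * diff @ np.linalg.solve(Sp, diff)
    df1, df2 = n_components, n1 + n0 - n_components - 1
    F = T2 * df2 / (df1 * (n1 + n0 - 2))
    return T2, stats.f.sf(F, df1, df2)

# Toy usage on simulated data (assumed sample sizes and allele frequency).
rng = np.random.default_rng(0)
G = rng.binomial(1, 0.02, size=(500, 20))    # 500 individuals, 20 rare variants
y = rng.integers(0, 2, size=500)             # random case/control labels
pos = np.sort(rng.integers(0, 10_000, size=20)).astype(float)
print(collapsing_test(G, y), pc_t2_test(G, y, pos))
```

Under the null simulation above, both p-values should be roughly uniform; the sketch is only meant to make the contrast concrete between collapsing all rare variants into one carrier indicator and summarizing the genotype profile as a small number of smooth-function scores.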