Refining Genetically Inferred Relationships Using Treelet Covariance Smoothing

Authors

Crossett Andrew, Lee Ann B, Klei Lambertus, Devlin Bernie, Roeder Kathryn

Affiliations

West Chester University, Carnegie Mellon University, University of Pittsburgh School of Medicine, University of Pittsburgh School of Medicine and Carnegie Mellon University.

Publication information

Ann Appl Stat. 2013 Jun 27;7(2):669-690. doi: 10.1214/12-AOAS598.

DOI:10.1214/12-AOAS598

PMID:24587841

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3935431/

Abstract

Recent technological advances, coupled with large sample sets, have uncovered many factors underlying the genetic basis of traits and the predisposition to complex disease, but much remains to be discovered. Familial relationships are a common thread in most genetic investigations. Close relatives can be identified from family records, and more distant relatives can be inferred from large panels of genetic markers. Unfortunately, these empirical estimates can be noisy, especially for distant relatives. We propose a new method for denoising genetically inferred relationship matrices by exploiting the underlying structure due to hierarchical groupings of correlated individuals. The approach, which we call Treelet Covariance Smoothing, employs a multiscale decomposition of covariance matrices to improve estimates of pairwise relationships. On both simulated and real data, we show that smoothing leads to better estimates of the relatedness among distantly related individuals. We illustrate our method with a large genome-wide association study and estimate the "heritability" of body mass index quite accurately. Traditionally, heritability, defined as the fraction of the total trait variance attributable to additive genetic effects, is estimated from samples of closely related individuals using random effects models. We show that, by using smoothed relationship matrices, we can estimate heritability from population-based samples. Finally, while our methods have been developed for refining genetic relationship matrices and improving estimates of heritability, they have much broader potential application in statistics, most notably for errors-in-variables random effects models and for settings that require regularization of matrices with block or hierarchical structure.
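The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a treelet-style decomposition built from successive Jacobi rotations of the most correlated pair of active variables, followed by soft-thresholding of the off-diagonal entries in the resulting basis. The function names `treelet_basis` and `treelet_smooth`, the number of levels `n_levels`, and the threshold `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def treelet_basis(cov, n_levels):
    """Build an orthogonal basis by repeatedly rotating the most correlated
    pair of active variables (assumed, simplified treelet construction)."""
    p = cov.shape[0]
    sigma = np.array(cov, dtype=float)
    basis = np.eye(p)
    active = list(range(p))
    for _ in range(min(n_levels, p - 1)):
        # Correlations of the current (rotated) variables.
        d = np.sqrt(np.clip(np.diag(sigma), 1e-12, None))
        corr = sigma / np.outer(d, d)
        # Find the most strongly correlated pair among active variables.
        best, pair = -np.inf, None
        for i, a in enumerate(active):
            for b in active[i + 1:]:
                if abs(corr[a, b]) > best:
                    best, pair = abs(corr[a, b]), (a, b)
        a, b = pair
        # Jacobi rotation angle that zeroes the covariance between a and b.
        theta = 0.5 * np.arctan2(2.0 * sigma[a, b], sigma[a, a] - sigma[b, b])
        c, s = np.cos(theta), np.sin(theta)
        rot = np.eye(p)
        rot[a, a], rot[a, b], rot[b, a], rot[b, b] = c, -s, s, c
        sigma = rot.T @ sigma @ rot
        basis = basis @ rot
        # Keep the higher-variance ("sum") variable active, retire the other.
        active.remove(b if sigma[a, a] >= sigma[b, b] else a)
    return basis, sigma

def treelet_smooth(cov, n_levels, lam):
    """Soft-threshold off-diagonal entries in the treelet-style basis,
    then map the result back to the original coordinates."""
    basis, sigma = treelet_basis(cov, n_levels)
    diag = np.diag(np.diag(sigma))
    off = sigma - diag
    off = np.sign(off) * np.maximum(np.abs(off) - lam, 0.0)
    return basis @ (diag + off) @ basis.T

# Toy relationship matrix (made-up values): individuals 0 and 1 are close
# relatives, individual 2 is only weakly related to both.
A = np.array([[1.00, 0.50, 0.10],
              [0.50, 1.00, 0.05],
              [0.10, 0.05, 1.00]])
A_smooth = treelet_smooth(A, n_levels=2, lam=0.08)
```

Because the basis is orthogonal and the diagonal is left untouched, a threshold of zero returns the input matrix unchanged, and any threshold preserves the trace. In practice the number of levels and the threshold would need to be tuned to the data, for example by cross-validation.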

Similar articles

Estimation of heritability from limited family data using genome-wide identity-by-descent sharing.

Genet Sel Evol. 2012 May 8;44(1):16. doi: 10.1186/1297-9686-44-16.

The phenome-wide distribution of genetic variance.

Am Nat. 2015 Jul;186(1):15-30. doi: 10.1086/681645. Epub 2015 May 12.

The estimation of additive genetic variance of body size in a wild passerine is sensitive to the method used to estimate relatedness among the individuals.

Ecol Evol. 2024 Feb 13;14(2):e10981. doi: 10.1002/ece3.10981. eCollection 2024 Feb.

Genomic kinship construction to enhance genetic analyses in the human connectome project data.

Hum Brain Mapp. 2019 Apr 1;40(5):1677-1688. doi: 10.1002/hbm.24479. Epub 2018 Nov 29.

Limitations of GCTA as a solution to the missing heritability problem.

Proc Natl Acad Sci U S A. 2016 Jan 5;113(1):E61-70. doi: 10.1073/pnas.1520109113. Epub 2015 Dec 22.

Human Facial Shape and Size Heritability and Genetic Correlations.

Genetics. 2017 Feb;205(2):967-978. doi: 10.1534/genetics.116.193185. Epub 2016 Dec 14.

Marker-based estimation of heritability in immortal populations.

Genetics. 2015 Feb;199(2):379-98. doi: 10.1534/genetics.114.167916. Epub 2014 Dec 19.

Pedigree-free animal models: the relatedness matrix reloaded.

Proc Biol Sci. 2008 Mar 22;275(1635):639-47. doi: 10.1098/rspb.2007.1032.

Localising loci underlying complex trait variation using Regional Genomic Relationship Mapping.

PLoS One. 2012;7(10):e46501. doi: 10.1371/journal.pone.0046501. Epub 2012 Oct 15.

Cited by

Analysis of Shared Haplotypes amongst Palauans Maps Loci for Psychotic Disorders to 4q28 and 5q23-q31.

Mol Neuropsychiatry. 2017 Feb;2(4):173-184. doi: 10.1159/000450726. Epub 2016 Oct 12.

Limitations of GCTA as a solution to the missing heritability problem.

Proc Natl Acad Sci U S A. 2016 Jan 5;113(1):E61-70. doi: 10.1073/pnas.1520109113. Epub 2015 Dec 22.

Two-Variance-Component Model Improves Genetic Prediction in Family Datasets.

Am J Hum Genet. 2015 Nov 5;97(5):677-90. doi: 10.1016/j.ajhg.2015.10.002.

Measuring missing heritability: inferring the contribution of common variants.

Proc Natl Acad Sci U S A. 2014 Dec 9;111(49):E5272-81. doi: 10.1073/pnas.1419064111. Epub 2014 Nov 24.

Effective genetic-risk prediction using mixed models.

Am J Hum Genet. 2014 Oct 2;95(4):383-93. doi: 10.1016/j.ajhg.2014.09.007.

Most genetic risk for autism resides with common variation.

Nat Genet. 2014 Aug;46(8):881-5. doi: 10.1038/ng.3039. Epub 2014 Jul 20.
