Heritability Estimation using a Regularized Regression Approach (HERRA): Applicable to continuous, dichotomous or age-at-onset outcome.

Author information

Gorfine Malka, Berndt Sonja I, Chang-Claude Jenny, Hoffmeister Michael, Le Marchand Loic, Potter John, Slattery Martha L, Keret Nir, Peters Ulrike, Hsu Li

Affiliations

Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel.

Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

Publication information

PLoS One. 2017 Aug 16;12(8):e0181269. doi: 10.1371/journal.pone.0181269. eCollection 2017.

DOI:10.1371/journal.pone.0181269

PMID:28813438

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5559077/

Abstract

The popular Genome-wide Complex Trait Analysis (GCTA) software uses random-effects models to estimate narrow-sense heritability from GWAS data on unrelated individuals, without knowing or identifying the causal loci. Many methods have since extended this approach to various situations. However, because the proportion of causal loci among the variants is typically very small and GCTA uses all variants to calculate the similarities among individuals, the heritability estimate may be unstable, with a large variance. Moreover, if the causal SNPs are not genotyped, GCTA sometimes greatly underestimates the true heritability. We present a novel narrow-sense heritability estimator, named HERRA, that uses well-developed ultra-high dimensional machine-learning methods and, like other existing methods, is applicable to continuous or dichotomous outcomes. Additionally, HERRA is applicable to time-to-event or age-at-onset outcomes, which, to our knowledge, no existing method can handle. Compared to GCTA and LDAK for continuous and binary outcomes, HERRA often has a smaller variance, and when causal SNPs are not genotyped, HERRA has a much smaller empirical bias. We applied GCTA, LDAK and HERRA to a large colorectal cancer dataset with a dichotomous outcome (4,312 cases, 4,356 controls, genotyped using Illumina 300K); the respective heritability estimates of GCTA, LDAK and HERRA are 0.068 (SE = 0.017), 0.072 (SE = 0.021) and 0.110 (SE = 5.19 × 10⁻³). HERRA yields an increase of over 50% in the heritability estimate compared to GCTA or LDAK.
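The "over 50%" figure can be checked directly from the three point estimates quoted in the abstract. The short sketch below is only an arithmetic illustration, not part of the HERRA method; the helper name `relative_increase` and the dictionary layout are assumptions introduced here for readability.

```python
# Heritability point estimates quoted in the abstract (dichotomous outcome,
# colorectal cancer dataset). Only these three numbers come from the source.
estimates = {"GCTA": 0.068, "LDAK": 0.072, "HERRA": 0.110}


def relative_increase(new: float, old: float) -> float:
    """Return the relative increase of `new` over `old`, as a fraction.

    Hypothetical helper, introduced only for this illustration.
    """
    return (new - old) / old


# Relative increase of the HERRA estimate over each comparison method.
increase_vs = {
    method: relative_increase(estimates["HERRA"], value)
    for method, value in estimates.items()
    if method != "HERRA"
}

for method, frac in increase_vs.items():
    print(f"HERRA vs {method}: {frac:.1%} higher")
```

On these numbers the HERRA estimate is roughly 62% higher than the GCTA estimate and roughly 53% higher than the LDAK estimate, which is consistent with the abstract's statement. The comparison concerns point estimates only and does not account for the reported standard errors.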

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/87d5/5559077/7a4f1452aa85/pone.0181269.g001.jpg
