Effectiveness of shrinkage and variable selection methods for the prediction of complex human traits using data from distantly related individuals.

Authors

Berger Swetlana, Pérez-Rodríguez Paulino, Veturi Yogasudha, Simianer Henner, de los Campos Gustavo

Affiliation

Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University Goettingen, Albrecht-Thaer-Weg 3, Goettingen, Germany.

Publication information

Ann Hum Genet. 2015 Mar;79(2):122-35. doi: 10.1111/ahg.12099. Epub 2015 Jan 20.

DOI:10.1111/ahg.12099

PMID:25600682

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4428155/

Abstract

Genome-wide association studies (GWAS) have detected large numbers of variants associated with complex human traits and diseases. However, the proportion of variance explained by GWAS-significant single nucleotide polymorphisms has usually been small. This has spurred interest in the use of whole-genome regression (WGR) methods. Yet there has been limited research on the factors that affect the prediction accuracy (PA) of WGRs when applied to human data from distantly related individuals. Here, we examine, using real human genotypes and simulated phenotypes, how trait complexity, marker-quantitative trait loci (QTL) linkage disequilibrium (LD), and the model used affect the performance of WGRs. Our results indicated that the estimated rate of missing heritability depends on the extent of marker-QTL LD, but that this parameter was not greatly affected by trait complexity. Regarding PA, our results indicated that: (a) under perfect marker-QTL LD, WGR can achieve moderately high prediction accuracy, and with simple genetic architectures variable selection methods outperform shrinkage procedures; and (b) under imperfect marker-QTL LD, variable selection methods can achieve reasonably good PA with simple or moderately complex genetic architectures; however, the PA of these methods deteriorated as trait complexity increased, and with highly complex traits variable selection and shrinkage methods both performed poorly. This was confirmed with an analysis of human height.
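The contrast between shrinkage and variable selection described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes independently simulated genotypes (so there is no LD between markers), places every QTL among the markers (loosely analogous to the perfect marker-QTL LD scenario), uses ridge regression as a stand-in for shrinkage, and uses marginal screening followed by least squares as a crude stand-in for variable selection. All function names, sample sizes, and parameter values are hypothetical.

```python
import numpy as np


def simulate(n=600, p=500, n_qtl=5, h2=0.5, seed=0):
    """Simulate standardized genotypes and a phenotype with heritability h2.

    Assumption: markers are drawn independently, so the QTL are simply a
    random subset of the markers.
    """
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=p)            # minor allele frequencies
    X = rng.binomial(2, maf, size=(n, p)).astype(float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)       # center and scale each marker
    qtl = rng.choice(p, size=n_qtl, replace=False)
    beta = np.zeros(p)
    beta[qtl] = rng.normal(size=n_qtl)             # effects only at the QTL
    g = X @ beta
    g *= np.sqrt(h2 / g.var())                     # genetic variance = h2
    y = g + rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    return X, y


def ridge_predict(X_tr, y_tr, X_te, lam):
    """Shrinkage stand-in: ridge regression, all markers shrunk equally."""
    mu = y_tr.mean()
    p = X_tr.shape[1]
    b = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ (y_tr - mu))
    return mu + X_te @ b


def select_predict(X_tr, y_tr, X_te, k):
    """Variable-selection stand-in: keep the k markers with the largest
    marginal association, then fit least squares on those markers only."""
    mu = y_tr.mean()
    score = np.abs(X_tr.T @ (y_tr - mu))
    keep = np.argsort(score)[-k:]
    b, *_ = np.linalg.lstsq(X_tr[:, keep], y_tr - mu, rcond=None)
    return mu + X_te[:, keep] @ b


def compare(n_qtl, n_train=400, k=10, h2=0.5, seed=0):
    """Return (PA of ridge, PA of selection), where PA is the correlation
    between observed and predicted phenotypes in the held-out individuals."""
    X, y = simulate(n_qtl=n_qtl, h2=h2, seed=seed)
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]
    lam = X.shape[1] * (1.0 - h2) / h2             # ridge penalty implied by h2
    pa_ridge = np.corrcoef(y_te, ridge_predict(X_tr, y_tr, X_te, lam))[0, 1]
    pa_select = np.corrcoef(y_te, select_predict(X_tr, y_tr, X_te, k))[0, 1]
    return pa_ridge, pa_select


if __name__ == "__main__":
    for n_qtl in (5, 200):                         # simple vs. complex trait
        print(n_qtl, compare(n_qtl))
```

Running the script prints the two accuracies for a simple trait (5 QTL) and a more complex one (200 QTL). Because the genotypes here carry no LD, the imperfect marker-QTL LD scenario cannot be reproduced by this toy; doing so would require real or LD-structured genotypes with the QTL masked from the marker panel.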

Similar articles

Prediction of complex human traits using the genomic best linear unbiased predictor.

PLoS Genet. 2013;9(7):e1003608. doi: 10.1371/journal.pgen.1003608. Epub 2013 Jul 11.

Genome-wide prediction of traits with different genetic architecture through efficient variable selection.

Genetics. 2013 Oct;195(2):573-87. doi: 10.1534/genetics.113.150078. Epub 2013 Aug 9.

Genomic prediction based on selective linkage disequilibrium pruning of low-coverage whole-genome sequence variants in a pure Duroc population.

Genet Sel Evol. 2023 Oct 18;55(1):72. doi: 10.1186/s12711-023-00843-w.

Using selection index theory to estimate consistency of multi-locus linkage disequilibrium across populations.

BMC Genet. 2015 Jul 19;16:87. doi: 10.1186/s12863-015-0252-6.

Impact of QTL minor allele frequency on genomic evaluation using real genotype data and simulated phenotypes in Japanese Black cattle.

BMC Genet. 2015 Nov 19;16:134. doi: 10.1186/s12863-015-0287-8.

Assessing the value of phenotypic information from non-genotyped animals for QTL mapping of complex traits in real and simulated populations.

BMC Genet. 2016 Jun 21;17(1):89. doi: 10.1186/s12863-016-0394-1.

Contributions of linkage disequilibrium and co-segregation information to the accuracy of genomic prediction.

Genet Sel Evol. 2016 Oct 11;48(1):77. doi: 10.1186/s12711-016-0255-4.

Accuracy of prediction of simulated polygenic phenotypes and their underlying quantitative trait loci genotypes using real or imputed whole-genome markers in cattle.

Genet Sel Evol. 2015 Dec 23;47:99. doi: 10.1186/s12711-015-0179-4.

An efficient unified model for genome-wide association studies and genomic selection.

Genet Sel Evol. 2017 Aug 24;49(1):64. doi: 10.1186/s12711-017-0338-x.

Cited by

Genetic Parameter and Hyper-Parameter Estimation Underlie Nitrogen Use Efficiency in Bread Wheat.

Int J Mol Sci. 2023 Sep 19;24(18):14275. doi: 10.3390/ijms241814275.

From Genotype to Phenotype: Polygenic Prediction of Complex Human Traits.

Methods Mol Biol. 2022;2467:421-446. doi: 10.1007/978-1-0716-2205-6_15.

Opportunities and limits of combining microbiome and genome data for complex trait prediction.

Genet Sel Evol. 2021 Aug 6;53(1):65. doi: 10.1186/s12711-021-00658-7.

A Bayesian linear mixed model for prediction of complex traits.

Bioinformatics. 2021 Apr 1;36(22-23):5415-5423. doi: 10.1093/bioinformatics/btaa1023.

Marker Selection in Multivariate Genomic Prediction Improves Accuracy of Low Heritability Traits.

Front Genet. 2020 Oct 30;11:499094. doi: 10.3389/fgene.2020.499094. eCollection 2020.

Investigation of prediction accuracy and the impact of sample size, ancestry, and tissue in transcriptome-wide association studies.

Genet Epidemiol. 2020 Jul;44(5):425-441. doi: 10.1002/gepi.22290. Epub 2020 Mar 19.

Ann Hum Genet. 2018 Mar;82(2):127. doi: 10.1111/ahg.12243.

Impact of QTL minor allele frequency on genomic evaluation using real genotype data and simulated phenotypes in Japanese Black cattle.

BMC Genet. 2015 Nov 19;16:134. doi: 10.1186/s12863-015-0287-8.

Application of high-dimensional feature selection: evaluation for genomic prediction in man.

Sci Rep. 2015 May 19;5:10312. doi: 10.1038/srep10312.
