Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship.

Authors

Lee S Hong, Clark Sam, van der Werf Julius H J

Affiliations

Centre for Population Health Research, School of Health Sciences and Sansom Institute of Health Research, University of South Australia, Adelaide, South Australia, Australia.

School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia.

Publication

PLoS One. 2017 Dec 21;12(12):e0189775. doi: 10.1371/journal.pone.0189775. eCollection 2017.

Abstract

Genomic prediction is emerging in a wide range of fields, including animal and plant breeding, risk prediction in human precision medicine, and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consist of information sources with varying degrees of relationship to the target individuals. A reference set used for genomic prediction can contain close and distant relatives as well as 'unrelated' individuals from the wider population. The various sources of information were modeled as different populations with different effective population sizes (Ne). Both the effective number of chromosome segments (Me) and Ne are considered to be functions of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available for each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than less closely related individuals. We also illustrate that when prediction relies on closer relatives, prediction accuracy improves less with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
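
The role of Me described in the abstract can be illustrated with a minimal sketch. It assumes the widely used single-source approximation r = sqrt(N h² / (N h² + Me)), which is not necessarily the formula derived in this paper and does not reproduce the released software. The function name `expected_accuracy`, the heritability of 0.5, and the Me values assigned to each scenario are hypothetical and chosen only for illustration.

```python
import math


def expected_accuracy(n_ref: int, h2: float, me: float) -> float:
    """Approximate expected accuracy of genomic prediction.

    Uses the common approximation r = sqrt(N*h2 / (N*h2 + Me)), where
    N is the reference sample size, h2 the heritability and Me the
    effective number of chromosome segments. This is an assumed
    stand-in, not the combined-source theory of the paper.
    """
    if n_ref < 0:
        raise ValueError("n_ref must be non-negative")
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("h2 must lie in [0, 1]")
    if me <= 0:
        raise ValueError("me must be positive")
    signal = n_ref * h2
    return math.sqrt(signal / (signal + me))


# Hypothetical Me values: closer relatives share longer segments with the
# target, so fewer effective segments have to be estimated.
scenarios = {
    "close relatives": 100.0,
    "distant relatives": 5_000.0,
    "unrelated": 50_000.0,
}

H2 = 0.5  # assumed heritability, for illustration only

for label, me in scenarios.items():
    r_small = expected_accuracy(1_000, H2, me)
    r_large = expected_accuracy(10_000, H2, me)
    print(f"{label:18s} N=1000: {r_small:.3f}  N=10000: {r_large:.3f}  "
          f"gain: {r_large - r_small:.3f}")
```

Under these assumed values, the close-relative scenario already starts at a high accuracy with the smaller reference set and so gains little from a tenfold increase in N, which is in line with the qualitative statement in the abstract.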

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c98e/5739427/7a8d1b3d1b5f/pone.0189775.g001.jpg
