Impact of rare and low-frequency sequence variants on reliability of genomic prediction in dairy cattle.

Affiliations

Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark.

Wageningen University and Research, Animal Breeding and Genomics, Wageningen, The Netherlands.

Publication information

Genet Sel Evol. 2018 Nov 20;50(1):62. doi: 10.1186/s12711-018-0432-8.

Abstract

BACKGROUND

Availability of whole-genome sequence data for a large number of cattle and efficient imputation methodologies open a new opportunity to include rare and low-frequency variants (RLFV) in genomic prediction in dairy cattle. The objective of this study was to examine the impact of including RLFV that are within genes and selected from whole-genome sequence variants, on the reliability of genomic prediction for fertility, health and longevity in dairy cattle.

RESULTS

All genic RLFV with a minor allele frequency lower than 0.05 were extracted from imputed sequence data and subsets were created using different strategies. These subsets were subsequently combined with Illumina 50 k single nucleotide polymorphism (SNP) data and used for genomic prediction. The reliability of prediction obtained with 50 k SNP data alone was used as the reference value, and absolute changes in reliability are reported as changes in percentage points. Adding a component that included either all genic RLFV or a subset of selected RLFV to the model, in addition to the 50 k component, changed the reliability of predictions by -2.2 to 1.1%, i.e. hardly any change in reliability of prediction was found, regardless of how the RLFV were selected. In addition to these empirical analyses, a simulation study was performed to evaluate the potential impact of adding RLFV to the model on the reliability of prediction. Three sets of causal RLFV (containing 21,468, 1,348 and 235 RLFV), randomly selected from different numbers of genes, were generated and accounted for additional genetic variance equal to 10% of the estimated variance explained by the 50 k SNPs. When genic RLFV based on mapping results were included in the prediction model, reliabilities improved by up to 4.0%, and when the causal RLFV were included they improved by up to 6.8%.
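
As a rough illustration of two quantities used above, the sketch below filters variants by minor allele frequency and expresses a reliability difference in percentage points. It is not the authors' pipeline. The function names, the 0/1/2 allele-count coding, the exclusion of monomorphic variants, and the toy numbers are all assumptions made for this example.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of each variant.

    Assumes `genotypes` is an (n_animals, n_variants) array of allele
    counts coded 0/1/2 (an assumption of this sketch).
    """
    allele_freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(allele_freq, 1.0 - allele_freq)


def select_rlfv(genotypes, maf_threshold=0.05):
    """Return column indices of variants with 0 < MAF < maf_threshold.

    The 0.05 threshold comes from the abstract; dropping monomorphic
    variants (MAF == 0) is an assumption of this sketch.
    """
    maf = minor_allele_frequency(genotypes)
    return np.flatnonzero((maf > 0.0) & (maf < maf_threshold))


def reliability_change_pp(reliability_with_rlfv, reliability_50k):
    """Absolute change in reliability, in percentage points.

    Both inputs are reliabilities on a 0-1 scale.
    """
    return 100.0 * (reliability_with_rlfv - reliability_50k)


# Toy data: 20 animals, 3 variants (hypothetical values).
genotypes = np.zeros((20, 3), dtype=int)
genotypes[0, 0] = 1    # one heterozygote: MAF = 1/40 = 0.025 (rare)
genotypes[:10, 1] = 1  # ten heterozygotes: MAF = 10/40 = 0.25 (common)
# Variant 2 stays monomorphic (MAF = 0).

rlfv_columns = select_rlfv(genotypes)
change = reliability_change_pp(0.411, 0.400)
```

Reporting the difference in percentage points, rather than as a relative change, matches the convention the abstract states for comparing against the 50 k reference.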

CONCLUSIONS

Using selected RLFV from whole-genome sequence data had only a small impact on the empirical reliability of genomic prediction in dairy cattle. Our simulations revealed that for sequence data to bring a benefit, the key is to identify causal RLFV.
