Using the Pareto principle in genome-wide breeding value estimation.

Affiliation

Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, 1432 Ås, Norway.

Publication information

Genet Sel Evol. 2011 Nov 1;43(1):35. doi: 10.1186/1297-9686-43-35.

Abstract

Genome-wide breeding value (GWEBV) estimation methods can be classified by the prior distribution they assume for marker effects. Genome-wide BLUP methods assume a normal prior distribution with a constant variance for all markers and are computationally fast. Bayesian methods apply more flexible prior distributions of SNP effects, which allow a few very large SNP effects while most effects are small or even zero; these methods are often computationally demanding because they rely on Markov chain Monte Carlo sampling. In this study, we adopted the Pareto principle to weight the available marker loci, i.e., we assume that x% of the loci explain (100 - x)% of the total genetic variance. Under this principle, the variances of the prior distributions of the 'big' and 'small' SNPs can also be defined: the relatively few big SNPs explain a large proportion of the genetic variance, whereas the majority of SNPs have small effects and explain a minor proportion. We name this method MixP; its prior distribution is a mixture of two normal distributions, one with a big variance and one with a small variance. Simulation results, using a real Norwegian Red cattle pedigree, show that MixP is at least as accurate as the other methods in all studied cases. The method also reduces the number of hyper-parameters of the prior distribution from two (the proportion and the variance of SNPs with big effects) to one (the proportion of SNPs with big effects), assuming the overall genetic variance is known. The mixture-of-normals prior made it possible to solve the equations iteratively, which reduced the computational load by two orders of magnitude. In an era of marker densities reaching millions and of whole-genome sequence data, MixP provides a computationally feasible Bayesian method of analysis.
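The statement that x% of the loci explain (100 - x)% of the genetic variance can be turned into two prior variances with simple arithmetic. The sketch below is illustrative only and is not the authors' implementation: it assumes that every SNP within a class contributes equally and that genotypes are scaled so that a SNP's variance contribution equals the variance of its effect. The function name `pareto_prior_variances` and its parameters are hypothetical.

```python
def pareto_prior_variances(total_genetic_variance, n_markers, big_fraction):
    """Split a known genetic variance between 'big' and 'small' SNPs.

    Assumption (not from the abstract): each SNP in a class contributes
    equally, and genotypes are scaled so that the per-SNP variance
    contribution equals the variance of the SNP effect.
    """
    if not 0.0 < big_fraction < 1.0:
        raise ValueError("big_fraction must lie strictly between 0 and 1")
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")

    # Expected number of SNPs in each class.
    n_big = big_fraction * n_markers
    n_small = (1.0 - big_fraction) * n_markers

    # Pareto principle: the fraction x of loci explains the fraction
    # (1 - x) of the variance, and the remaining loci explain x of it.
    var_big = (1.0 - big_fraction) * total_genetic_variance / n_big
    var_small = big_fraction * total_genetic_variance / n_small
    return var_big, var_small


# Hypothetical numbers: 10,000 markers, 20% of them 'big', variance 1.0.
var_big, var_small = pareto_prior_variances(1.0, 10_000, 0.2)
print(var_big, var_small)
```

Under these assumptions the proportion of big SNPs is the only free quantity once the total genetic variance is fixed, which matches the abstract's point that a single hyper-parameter remains.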

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6600/3354342/092202546d1c/1297-9686-43-35-1.jpg