Estimating genomic relationships of metafounders across and within breeds using maximum likelihood, pseudo-expectation-maximization maximum likelihood and increase of relationships.

Affiliations

CDCB, 4201 Northview Drive, Bowie, MD, 20716, USA.

Animal and Dairy Science, University of Georgia, 425 River Rd, Athens, GA, 30602, USA.

Publication information

Genet Sel Evol. 2024 May 2;56(1):35. doi: 10.1186/s12711-024-00892-9.

Abstract

BACKGROUND

The theory of "metafounders" proposes a unified framework for relationships across base populations within breeds (e.g. unknown parent groups), and base populations across breeds (crosses) together with a sensible compatibility with genomic relationships. Considering metafounders might be advantageous in pedigree best linear unbiased prediction (BLUP) or single-step genomic BLUP. Existing methods to estimate relationships across metafounders are not well adapted to highly unbalanced data, genotyped individuals far from base populations, or many unknown parent groups (within breed per year of birth).

METHODS

We derive likelihood methods to estimate Γ, the matrix of relationships across metafounders. For a single metafounder, summary statistics of pedigree and genomic relationships allow deriving a cubic equation whose real root is the maximum likelihood (ML) estimate of γ. This equation is tested with Lacaune sheep data. For several metafounders, we split the first derivative of the complete likelihood into a first term related to Γ and a second term related to Mendelian sampling variances. Approximating the first derivative by its first term results in a pseudo-EM algorithm that iteratively updates the estimate of Γ by the corresponding block of the H-matrix. The method extends to complex situations with groups defined by year of birth, modelling the increase of Γ using estimates of the rate of increase of inbreeding (ΔF), resulting in an expanded Γ and in a pseudo-EM+ΔF algorithm. We compare these methods with the generalized least squares (GLS) method using simulated data: complex crosses of two breeds in equal or asymmetric proportions; and two breeds with 10 groups per year of birth within breed. We simulate genotyping in all generations or only in the last ones.
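To make the single-metafounder case concrete, the following is a minimal sketch, not the authors' cubic-equation solution. It assumes the standard metafounder construction A^γ = A(1 − γ/2) + γ11' and a Gaussian-type likelihood kernel log|A^γ| + tr((A^γ)⁻¹G); the abstract does not state the likelihood, so that kernel is an assumption. A plain grid search stands in for the closed-form root, and the function names and the toy pedigree are hypothetical.

```python
import numpy as np

def a_gamma(A, gamma):
    """Pedigree relationships with one metafounder: A(1 - gamma/2) + gamma * 11'."""
    n = A.shape[0]
    return A * (1.0 - gamma / 2.0) + gamma * np.ones((n, n))

def neg_loglik_kernel(A, G, gamma):
    """Assumed kernel (up to constants): log|A_gamma| + tr(A_gamma^{-1} G)."""
    S = a_gamma(A, gamma)
    _, logdet = np.linalg.slogdet(S)
    return logdet + np.trace(np.linalg.solve(S, G))

def grid_ml_gamma(A, G, grid=None):
    """Grid search over gamma in (0, 2); a stand-in for solving the cubic."""
    if grid is None:
        grid = np.linspace(0.01, 1.99, 199)  # step of 0.01
    values = [neg_loglik_kernel(A, G, g) for g in grid]
    return float(grid[int(np.argmin(values))])

# Toy pedigree: two unrelated founders and their two full-sib offspring.
A = np.array([
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
])

# Idealised "genomic" matrix that matches the model exactly at gamma = 0.5.
G = a_gamma(A, 0.5)
gamma_hat = grid_ml_gamma(A, G)
```

In this idealised setting G equals A^γ at γ = 0.5, so the kernel is minimised there and the grid search recovers that value. With real data, G would come from marker genotypes and the estimate would be limited by the grid resolution.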

RESULTS

For a single metafounder, the ML estimate for the Lacaune data corresponded to the maximum of the likelihood. For simulated data, when genotypes were spread across all generations, both the GLS and pseudo-EM(+ΔF) methods were accurate. With genotypes available only in the most recent generations, the GLS method was biased, whereas the pseudo-EM(+ΔF) approach yielded more accurate and unbiased estimates.

CONCLUSIONS

We derived ML, pseudo-EM and pseudo-EM+ΔF methods to estimate Γ in many realistic settings. Estimates are accurate in real and simulated data and have a low computational cost.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06af/11536831/afd113318baa/12711_2024_892_Fig1_HTML.jpg
