Copula miss-specification in REML multivariate genetic animal model estimation.

Affiliations

GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France.

Université Paris-Saclay, INRAE, AgroParisTech, GABI, Jouy-en-Josas, France.

Publication information

Genet Sel Evol. 2022 May 26;54(1):36. doi: 10.1186/s12711-022-00729-3.

Abstract

BACKGROUND

In animal genetics, linear mixed models are used to account for genetic and environmental effects. The variance and covariance components of these models are usually estimated by restricted maximum likelihood (REML), which provides unbiased estimators. REML estimation relies on a strong assumption: the response variables are multivariate normal. In practice, however, even when the marginal distribution of each phenotype is normal, multivariate normality can be violated if the cross-sectional dependence structure is non-normal, that is, if the copula of the multivariate distribution is not Gaussian. This study uses simulations to evaluate the impact of copula misspecification in a bivariate animal model on REML estimates of variance components.
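To make the distinction between normal margins and a non-Gaussian copula concrete, here is a minimal sketch that draws bivariate residuals whose margins are standard normal but whose dependence follows a Clayton copula. The Clayton family, the parameter value `theta = 2.0`, the sample size, and the helper names `clayton_normal_pair` and `joint_tail_fractions` are illustrative assumptions; the abstract does not state which copulas or settings the study used.

```python
import random
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()  # standard normal, used for the marginal transform


def clayton_normal_pair(theta, rng):
    """Draw (e1, e2) with N(0, 1) margins joined by a Clayton copula.

    Uses the Marshall-Olkin (gamma frailty) construction; theta > 0 and
    Kendall's tau equals theta / (theta + 2).
    """
    v = rng.gammavariate(1.0 / theta, 1.0)  # shared frailty variable
    pair = []
    for _ in range(2):
        u = (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside (0, 1)
        pair.append(STD_NORMAL.inv_cdf(u))
    return tuple(pair)


def joint_tail_fractions(pairs, cut=1.5):
    """Fraction of pairs with both values below -cut, and both above +cut."""
    n = len(pairs)
    lower = sum(1 for a, b in pairs if a < -cut and b < -cut) / n
    upper = sum(1 for a, b in pairs if a > cut and b > cut) / n
    return lower, upper


rng = random.Random(1)  # fixed seed so the sketch is repeatable
pairs = [clayton_normal_pair(2.0, rng) for _ in range(20000)]
e1 = [a for a, _ in pairs]

print("margin mean and sd:", round(mean(e1), 3), round(pstdev(e1), 3))
print("joint lower/upper tail fractions:", joint_tail_fractions(pairs))
```

Each margin looks standard normal on its own, yet joint extremes cluster on the low side because the Clayton copula has lower-tail dependence and no upper-tail dependence. A bivariate normal distribution cannot produce this asymmetry, which is the kind of departure from multivariate normality the abstract describes.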

RESULTS

Bivariate phenotypes were simulated for populations undergoing selection, with different copulas for the dependence structure between the error components. Two multi-trait situations were considered: either both phenotypes were measured on the selection candidates, or only one was. Three generations with random selection and five generations with truncation selection based on estimated breeding values were simulated. Under random selection, no significant differences were observed between the REML estimates of the variance components and the true parameters, even for the non-Gaussian distributions. Under truncation selection, when both phenotypes were measured on candidates, the variance component estimates were systematically biased for non-Gaussian distributions with high residual dependence, especially heavy-tailed or asymmetric ones. Conversely, when only one phenotype was measured on candidates, REML estimates did not differ between the Gaussian and non-Gaussian distributions.
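The following toy sketch shows what one round of truncation selection looks like when bivariate residuals come from either a Gaussian or a non-Gaussian copula. It is not the study's simulation. It uses a single generation, selects on the trait 1 phenotype as a stand-in for estimated breeding values, fits no REML model, and draws genetic values independently for the two traits. The Clayton copula, the values `h2 = 0.3`, `rho = 0.7`, `theta = 2.0`, `keep = 0.2`, and the function names are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def residual_pair(rng, copula, rho=0.7, theta=2.0):
    """Return residuals (e1, e2) with N(0, 1) margins.

    copula="gaussian": bivariate normal with correlation rho.
    copula="clayton":  Clayton copula (Marshall-Olkin construction), theta > 0.
    """
    if copula == "gaussian":
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2
    v = rng.gammavariate(1.0 / theta, 1.0)  # shared frailty variable
    pair = []
    for _ in range(2):
        u = (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
        pair.append(STD_NORMAL.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12)))
    return pair[0], pair[1]


def simulate_selection(copula, n=20000, h2=0.3, keep=0.2, seed=1):
    """One round of truncation selection on trait 1; returns selected animals.

    Each animal is (y1, y2) with y = a + e, Var(a) = h2 and Var(e) = 1 - h2.
    Genetic values are drawn independently for the two traits (an assumption).
    """
    rng = random.Random(seed)
    sd_a, sd_e = math.sqrt(h2), math.sqrt(1.0 - h2)
    animals = []
    for _ in range(n):
        e1, e2 = residual_pair(rng, copula)
        y1 = rng.gauss(0.0, sd_a) + sd_e * e1
        y2 = rng.gauss(0.0, sd_a) + sd_e * e2
        animals.append((y1, y2))
    # Truncation selection: keep the top fraction ranked on trait 1.
    animals.sort(key=lambda pair: pair[0], reverse=True)
    return animals[: int(keep * n)]


for copula in ("gaussian", "clayton"):
    selected = simulate_selection(copula)
    print(copula, len(selected), round(mean(y2 for _, y2 in selected), 3))
```

Comparing the trait 2 mean among selected animals across the two copulas gives a feel for how the residual dependence structure interacts with selection. Reproducing the biases reported in the abstract would require the multi-generation pedigree, breeding value estimation, and REML fitting that this sketch deliberately leaves out.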

CONCLUSIONS

This study confirms that geneticists can use REML to evaluate breeding values in the multivariate case even when the multivariate phenotypes deviate from normality, provided that selection is random or that one trait is not measured on the candidates under selection. Nevertheless, when both traits are measured, violation of the normality assumption may lead to non-negligible biases in the REML estimates of the variance-covariance components.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4206/9137146/e036c670b7e1/12711_2022_729_Fig1_HTML.jpg
