Prediction of Complex Traits: Robust Alternatives to Best Linear Unbiased Prediction.

Authors

Gianola Daniel, Cecchinato Alessio, Naya Hugo, Schön Chris-Carolin

Affiliations

Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, United States.

Department of Dairy Science, University of Wisconsin-Madison, Madison, WI, United States.

Publication information

Front Genet. 2018 Jun 5;9:195. doi: 10.3389/fgene.2018.00195. eCollection 2018.

Abstract

A widely used method for prediction of complex traits in animal and plant breeding is "genomic best linear unbiased prediction" (GBLUP). In a quantitative genetics setting, BLUP is a linear regression of phenotypes on a pedigree or on a genomic relationship matrix, depending on the type of input information available. Normality of the distributions of random effects and of model residuals is not required for BLUP, but a Gaussian assumption is made implicitly. A potential downside is that Gaussian linear regressions are sensitive to outliers, whether genetic or environmental in origin. We present robust alternatives to BLUP that are simple to implement (relative to a fully Bayesian analysis), using a linear model with residual t or Laplace distributions instead of a Gaussian one. We evaluate the methods with milk yield records on Italian Brown Swiss cattle, grain yield data in inbred wheat lines, and three traits measured on accessions of Arabidopsis thaliana. The methods do not use Markov chain Monte Carlo sampling, and model hyper-parameters, viewed here as regularization "knobs," are tuned via cross-validation. Uncertainty of predictions is evaluated by bootstrapping or by random reconstruction of training and testing sets. In several cases (e.g., test-day milk yield in cows, and flowering time and FRIGIDA expression in Arabidopsis), the best predictions were often those obtained with the robust methods. The results are encouraging and motivate further investigation and generalization.
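The contrast the abstract draws between Gaussian and heavy-tailed residuals can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a marker-effects ridge regression as a stand-in for BLUP, swaps in iteratively reweighted least squares (IRLS) as one common way to fit a Laplace-residual model without MCMC, and tunes the regularization "knob" by k-fold cross-validation. The function names (`ridge_fit`, `laplace_irls_fit`, `cv_choose_lambda`), the simulated data, and the grid of penalty values are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Gaussian residuals with a ridge penalty (a BLUP-like solution)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def laplace_irls_fit(X, y, lam, n_iter=50, eps=1e-6, tol=1e-8):
    """Laplace residuals with a ridge penalty, fitted by IRLS.

    Minimizes sum_i |y_i - x_i'b| + (lam / 2) * ||b||^2 by repeatedly
    solving a weighted ridge problem with weights 1 / |residual|.
    """
    p = X.shape[1]
    b = ridge_fit(X, y, lam)  # start from the Gaussian solution
    for _ in range(n_iter):
        r = y - X @ b
        w = 1.0 / np.maximum(np.abs(r), eps)  # large residuals get small weight
        XtW = X.T * w                          # same as X.T @ diag(w)
        b_new = np.linalg.solve(XtW @ X + lam * np.eye(p), XtW @ y)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b

def cv_choose_lambda(X, y, lams, fit, k=5, seed=0):
    """Pick the penalty with the smallest k-fold mean absolute prediction error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for lam in lams:
        errs = []
        for test_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[test_idx] = False
            b = fit(X[train], y[train], lam)
            errs.append(np.mean(np.abs(y[test_idx] - X[test_idx] @ b)))
        scores.append(np.mean(errs))
    return lams[int(np.argmin(scores))]

# Simulated data (hypothetical): 200 records, 5 predictors, 10% gross outliers.
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.standard_normal((n, p))
b_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ b_true + 0.5 * rng.standard_normal(n)
y[:20] += 50.0  # contaminate the first 20 records

lams = [0.01, 0.1, 1.0, 10.0]
best_lam = cv_choose_lambda(X, y, lams, laplace_irls_fit)
b_ridge = ridge_fit(X, y, best_lam)
b_robust = laplace_irls_fit(X, y, best_lam)
err_ridge = np.linalg.norm(b_ridge - b_true)
err_robust = np.linalg.norm(b_robust - b_true)
print(f"lambda={best_lam}, ridge error={err_ridge:.3f}, robust error={err_robust:.3f}")
```

In this toy setting the contaminated records pull the Gaussian fit away from the true effects, while the Laplace fit down-weights them. A residual t model could be handled with the same reweighting idea using different weights, though that variant is not shown here.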

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f14d/6008589/e6e9fad97b47/fgene-09-00195-g0001.jpg
