Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430, Guadalajara, JAL, Mexico.
Facultad de Telemática, Universidad de Colima, Colima, Mexico.
Theor Appl Genet. 2019 May;132(5):1587-1606. doi: 10.1007/s00122-019-03303-6. Epub 2019 Feb 12.
Current genome-enabled prediction models assume normally distributed errors, which makes them sensitive to outliers. We propose a model whose errors are assumed to follow a Laplace distribution, so that it handles outliers better.

Current genome-enabled prediction models use regressions that fit the expected value (mean) of a response variable, with errors assumed to be normally distributed. Such models are often sensitive to outliers, whether genetic or environmental. For this reason, we propose a robust Bayesian genome median regression (BGMR) model that fits the regression to the median of the response distribution rather than its mean, with errors assumed to follow a Laplace distribution. The BGMR model was fitted under a Bayesian framework by Markov chain Monte Carlo sampling, using a location-scale mixture representation of the Laplace distribution. The BGMR model was applied to two simulated and two real genomic data sets, and its prediction performance was compared with that of a conventional genomic best linear unbiased prediction (GBLUP) model and the Laplace maximum a posteriori (LMAP) method. When outliers were present, the prediction accuracies of BGMR were higher than those of the GBLUP and LMAP methods. The BGMR model could be useful to breeders who need to predict and select genotypes from data that contain unknown outliers.
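The mixture representation mentioned above can be illustrated with a minimal sketch. It is not the authors' sampler: it only shows the standard fact that, in the symmetric (median) case, a Laplace error can be written as a normal variate whose variance is itself exponentially distributed, which is what makes Gibbs-type MCMC updates convenient. The function name `sample_laplace_via_mixture`, the scale value, and the sample size are assumptions chosen for illustration.

```python
import math
import random


def sample_laplace_via_mixture(scale, n, rng):
    """Draw n Laplace(0, scale) variates as a scale mixture of normals.

    Assumed parameterization: density (1 / (2 * scale)) * exp(-|e| / scale).
    """
    draws = []
    for _ in range(n):
        w = rng.expovariate(1.0)            # mixing variable, W ~ Exp(1)
        sd = scale * math.sqrt(2.0 * w)     # conditional std. dev. given W
        draws.append(rng.gauss(0.0, sd))    # e | W ~ N(0, 2 * scale^2 * W)
    return draws


rng = random.Random(0)                      # fixed seed for reproducibility
scale = 1.5                                 # hypothetical Laplace scale
e = sample_laplace_via_mixture(scale, 100_000, rng)

# For a Laplace(0, scale) variate, E|e| = scale and Var(e) = 2 * scale^2.
mean_abs = sum(abs(x) for x in e) / len(e)
var = sum(x * x for x in e) / len(e)
print(f"mean |e| = {mean_abs:.3f} (theory {scale})")
print(f"var(e)   = {var:.3f} (theory {2 * scale ** 2})")
```

Conditional on the mixing variable, the error is Gaussian, so the regression coefficients keep the familiar normal-model conditional distributions. The heavy Laplace tails arise only after averaging over the mixing variable, and those tails are what reduce the influence of outlying observations.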