INRA, GenPhySE, Castanet-Tolosan 31320, France; Facultad de Veterinaria, Universidad de la República, 11600 Montevideo, Uruguay.
CSIRO Agriculture and Food, St. Lucia 4067, Australia.
J Dairy Sci. 2020 Jan;103(1):529-544. doi: 10.3168/jds.2019-16603. Epub 2019 Nov 6.
Bias in genetic evaluations has been a constant concern in animal genetics. Interest in this topic has increased in recent years, since many studies have detected overestimation (bias) in estimated breeding values (EBV). Detecting the existence of bias, and the realized accuracy of predictions, is therefore important, yet it is difficult when studying small data sets or breeds. In this study, we tested by simulation the recently presented Linear Regression (LR) method for estimating the bias, slope, and accuracy of pedigree EBV. The LR method computes statistics by comparing, for the same individuals, EBV from a data set containing old, partial information with EBV from a data set containing all information (old and new; the whole data set). The method proposes an estimator of bias (Δ̂), an estimator of slope (b̂), and 3 estimators related to accuracies: the ratio between accuracies [Formula: see text], the reliability of the partial data set (acĉ²), and the ratio of reliabilities (ρ̂²). We simulated a dairy scheme for low (0.10) and moderate (0.30) heritabilities. In both cases, we checked the behavior of the estimators in 3 scenarios: (1) when the evaluation model is the same as the model used to simulate the data; (2) when the evaluation model uses an incorrect heritability; and (3) when the data include an environmental trend. For scenarios in which the evaluation model was correct, the LR method correctly estimated bias, slope, and accuracies, with better performance for higher heritability [i.e., corr(b, b̂) was 0.45 for h² = 0.10 and 0.59 for h² = 0.30]. When the evaluation model used an incorrect heritability, the bias was correctly estimated in direction but not in magnitude. Similarly, the magnitudes of bias and slope were underestimated in scenarios with environmental trends in the data, except for cases in which contemporary groups were random and greatly shrunken. In general, accuracies were well estimated in all scenarios. The LR method can check bias and accuracy in all cases, provided that the evaluation model is reasonably correct or robust, and its estimates are more precise with more information (e.g., high heritability). If the model uses an incorrect heritability or a hidden trend exists in the data, it is still possible to detect the existence and direction of bias and of a departure in slope, but not always their magnitudes.
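The comparison of partial and whole EBV described above can be illustrated with a minimal sketch. It assumes the usual LR definitions: bias as the difference in mean EBV (partial minus whole), slope as the regression of whole on partial EBV, the ratio of accuracies as their correlation, and the ratio of reliabilities as the regression of partial on whole EBV. The function name `lr_statistics`, its argument names, and the optional `genetic_variance` argument are hypothetical and not taken from the article. In particular, how the genetic variance of the validation individuals should be adjusted for inbreeding and relationships is left to the caller.

```python
from math import sqrt
from statistics import mean


def _cov(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def lr_statistics(ebv_partial, ebv_whole, genetic_variance=None):
    """Sketch of the LR statistics for one group of validation individuals.

    ebv_partial, ebv_whole: EBV of the SAME individuals, in the same order,
    from the partial and the whole data set.
    genetic_variance: (assumption) genetic variance among these individuals,
    already adjusted by the caller; needed only for the partial reliability.
    """
    if len(ebv_partial) != len(ebv_whole) or len(ebv_partial) < 2:
        raise ValueError("need two equal-length EBV vectors with >= 2 animals")

    cov_wp = _cov(ebv_whole, ebv_partial)
    var_p = _cov(ebv_partial, ebv_partial)
    var_w = _cov(ebv_whole, ebv_whole)

    stats = {
        # Expected value 0 when evaluations are unbiased.
        "bias": mean(ebv_partial) - mean(ebv_whole),
        # Regression of whole on partial EBV; expected value 1.
        "slope": cov_wp / var_p,
        # Correlation between partial and whole EBV.
        "ratio_accuracies": cov_wp / sqrt(var_p * var_w),
        # Regression of partial on whole EBV.
        "ratio_reliabilities": cov_wp / var_w,
    }
    if genetic_variance is not None:
        stats["reliability_partial"] = cov_wp / genetic_variance
    return stats


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not data from the study.
    partial = [1.0, 2.0, 3.0, 4.0]
    whole = [1.5, 2.5, 3.5, 4.5]
    print(lr_statistics(partial, whole))
```

In this toy call the whole EBV are the partial EBV shifted by a constant, so the sketch reports a nonzero bias while the slope and both ratios equal 1.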