Biosciences Research Division, Department of Primary Industries Victoria, Bundoora, Australia.
Genet Sel Evol. 2009 Nov 24;41(1):51. doi: 10.1186/1297-9686-41-51.
Two key findings from genomic selection experiments are 1) the reference population used must be very large to subsequently predict accurate genomic estimated breeding values (GEBV), and 2) prediction equations derived in one breed do not predict accurate GEBV when applied to other breeds. Both findings are a problem for breeds where the number of individuals in the reference population is limited. A multi-breed reference population is a potential solution, and here we investigate the accuracies of GEBV in Holstein dairy cattle and Jersey dairy cattle when the reference population is single breed or multi-breed. The accuracies were obtained both as a function of elements of the inverse coefficient matrix and from the realised accuracies of GEBV.
Best linear unbiased prediction with a multi-breed genomic relationship matrix (GBLUP) and two Bayesian methods (BAYESA and BAYES_SSVS), which estimate individual SNP effects, were used to predict GEBV for 400 young Holstein bulls and 77 young Jersey bulls from a reference population of 781 Holstein bulls and 287 Jersey bulls. Genotypes of 39,048 SNP markers were used. Phenotypes in the reference population were de-regressed breeding values for production traits. For the GBLUP method, expected accuracies calculated from the diagonal of the inverse of the coefficient matrix were compared to realised accuracies.
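The GBLUP step and the expected accuracy taken from the diagonal of the inverse coefficient matrix can be illustrated with a minimal sketch. It is not the authors' implementation. The genomic relationship matrix formula (centred genotypes scaled by 2Σp(1−p)), the single fixed mean, the known heritability `h2`, the small ridge added to keep G invertible, and all function names are assumptions made for illustration only.

```python
import numpy as np

def genomic_relationship_matrix(M):
    """Build G from genotypes M (animals x SNPs, coded 0/1/2).

    Assumed formula: G = WW' / (2 * sum(p * (1 - p))), W = M - 2p.
    """
    p = M.mean(axis=0) / 2.0                 # allele frequency per SNP
    W = M - 2.0 * p                          # centre each SNP column
    return W @ W.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(y, has_record, G, h2, ridge=0.01):
    """Solve the mixed model equations for y = 1*mu + Z*g + e.

    y          : phenotypes of the reference animals only
    has_record : boolean mask over all animals (True = reference animal)
    Returns GEBV and expected accuracy for every animal.
    """
    n = G.shape[0]
    G = G + ridge * np.eye(n)                # assumed ridge so G can be inverted
    lam = (1.0 - h2) / h2                    # sigma_e^2 / sigma_g^2
    Z = np.eye(n)[has_record]                # maps records to animals
    X = np.ones((Z.shape[0], 1))             # overall mean as the only fixed effect
    C = np.block([[X.T @ X, X.T @ Z],
                  [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(G)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    C_inv = np.linalg.inv(C)
    gebv = (C_inv @ rhs)[1:]
    # PEV_i = C^{ii} * sigma_e^2, so reliability_i = 1 - lam * C^{ii} / G_ii
    reliability = 1.0 - lam * np.diag(C_inv)[1:] / np.diag(G)
    accuracy = np.sqrt(np.clip(reliability, 0.0, 1.0))
    return gebv, accuracy

# Toy data only: 60 animals, 500 SNPs, first 40 animals form the reference set.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(60, 500)).astype(float)
has_record = np.arange(60) < 40
y = rng.normal(size=40)
gebv, accuracy = gblup(y, has_record, genomic_relationship_matrix(M), h2=0.5)
```

In this sketch the animals without records play the role of the young validation bulls: they receive a GEBV and an expected accuracy purely through their genomic relationships with the reference animals.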
When GBLUP was used, expected accuracies from a function of elements of the inverse coefficient matrix agreed reasonably well with realised accuracies calculated from the correlation between GEBV and EBV in single breed populations, but not in multi-breed populations. When the Bayesian methods were used, realised accuracies of GEBV were up to 13% higher when the multi-breed reference population was used than when a pure breed reference was used. However, no consistent increase in accuracy across traits was obtained.
Predicting genomic breeding values using a genomic relationship matrix is an attractive approach to implementing genomic selection, as expected accuracies of GEBV can be readily derived. However, in multi-breed populations, Bayesian approaches give higher accuracies for some traits. Finally, multi-breed reference populations will be a valuable resource for fine mapping QTL.
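The realised accuracy referred to here is the correlation between GEBV and EBV in the validation animals. A minimal sketch of that calculation, and of a gain between two reference populations, follows. The function names are hypothetical, and because the abstract does not state whether "13% higher" is a relative or an absolute difference, the sketch reports both.

```python
import numpy as np

def realised_accuracy(gebv, ebv):
    """Pearson correlation between predicted GEBV and EBV in validation animals."""
    return float(np.corrcoef(gebv, ebv)[0, 1])

def accuracy_gain(acc_single, acc_multi):
    """Return (absolute difference, relative difference in percent)."""
    absolute = acc_multi - acc_single
    relative_pct = 100.0 * absolute / acc_single
    return absolute, relative_pct
```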