University of Hohenheim, Stuttgart, Germany
Theor Appl Genet. 2015 Apr;128(4):693-703. doi: 10.1007/s00122-015-2464-6. Epub 2015 Mar 4.
We evaluated several methods for computing shrinkage estimates of the genomic relationship matrix and demonstrated their potential to enhance the reliability of genomic estimated breeding values (GEBVs) of training set individuals. In genomic prediction in plant breeding, the training set constitutes a large fraction of the total number of genotypes assayed and is itself subject to selection. The objective of our study was to investigate whether the GEBVs of individuals in the training set can be enhanced by shrinkage estimation of the genomic relationship matrix. We simulated two different population types: a diversity panel of unrelated individuals and a biparental family of doubled haploid lines. For different training set sizes (50, 100, 200), numbers of markers (50, 100, 200, 500, 2,500) and heritabilities (0.25, 0.5, 0.75), shrinkage coefficients were computed by four different methods. Two of these methods are novel and based on measures of linkage disequilibrium (LD); the other two were previously described in the literature, and we extended one of them. Our results showed that shrinkage estimation of the genomic relationship matrix can significantly improve the reliability of the GEBVs of training set individuals, especially for a low number of markers. We demonstrate that the number of markers is the primary determinant of the optimum shrinkage coefficient, that is, the coefficient that maximizes the reliability, and we recommend methods suitable for routine use in practical applications.
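To make the idea of shrinking a genomic relationship matrix concrete, the following is a minimal Python sketch and not the procedure used in the study. It assumes a marker-based relationship matrix built from centered allele counts (coded 0/1/2) and a generic linear shrinkage toward a scaled identity target; the abstract specifies neither the estimator nor the target, and the shrinkage coefficient `delta` is simply passed in rather than computed by any of the four methods mentioned. The function names `genomic_relationship` and `shrink_relationship` are hypothetical.

```python
def genomic_relationship(markers):
    """Marker-based relationship matrix from allele counts (0/1/2).

    Assumption: centered allele counts, scaled by 2 * sum(p * (1 - p)).
    `markers` is a list of rows, one row per individual.
    """
    n = len(markers)
    m = len(markers[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in markers) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by twice the allele frequency.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in markers]
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
        for zi in centered
    ]


def shrink_relationship(G, delta):
    """Linear shrinkage: (1 - delta) * G + delta * T.

    Assumption: the target T is the identity matrix scaled by the mean
    diagonal of G, so the trace of G is preserved.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(G)
    mean_diag = sum(G[i][i] for i in range(n)) / n
    return [
        [
            (1.0 - delta) * G[i][k] + (delta * mean_diag if i == k else 0.0)
            for k in range(n)
        ]
        for i in range(n)
    ]


# Toy data (made up for illustration): 4 individuals, 3 markers.
markers = [[0, 2, 1], [2, 0, 1], [1, 1, 2], [2, 2, 0]]
G = genomic_relationship(markers)
G_shrunk = shrink_relationship(G, 0.5)
```

With `delta = 0` the matrix is returned unchanged, and with `delta = 1` all off-diagonal relationships are set to zero. Under this assumed form, intermediate values damp the off-diagonal entries, which are estimated with more sampling noise when few markers are available.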