Strandén I, Matilainen K, Aamand G P, Mäntysaari E A
Natural Resources Institute Finland (Luke), Green Technology, Biometrical Genetics, Jokioinen, Finland.
NAV Nordic Cattle Genetic Evaluation, Aarhus, Denmark.
J Anim Breed Genet. 2017 Jun;134(3):264-274. doi: 10.1111/jbg.12257.
Single-step genomic BLUP (ssGBLUP) requires a dense matrix, of size equal to the number of genotyped animals, in the coefficient matrix of the mixed model equations (MME). When the number of genotyped animals is high, the solving time of the MME is dominated by this matrix. The matrix is the difference of two inverse relationship matrices, genomic ($G$) and pedigree ($A_{22}$), that is, $G^{-1} - A_{22}^{-1}$. Different approaches were used to ease computations, reduce computing time and improve numerical stability. The inverse of $A_{22}$ can be computed as $A_{22}^{-1} = A^{22} - A^{21}(A^{11})^{-1}A^{12}$, where $A^{ij}$, $i, j = 1, 2$, are sparse sub-matrices of $A^{-1}$, and the numbers 1 and 2 refer to non-genotyped and genotyped animals, respectively. Inversion of $A_{22}$ was avoided by three alternative approaches: iteration on pedigree (IOP), matrix iteration in memory (IM), and Cholesky decomposition by the CHOLMOD library (CM). For the inverse of $G$, the APY (algorithm for proven and young) approach using Cholesky decomposition was formulated. Different approaches to choosing the APY core were compared. These approaches were tested on a joint genetic evaluation of Nordic Holstein cattle for fertility traits, which included 81,031 genotyped animals. Computing time per iteration was 1.19 min by regular ssGBLUP, 1.49 min by IOP, 1.32 min by IM, and 1.21 min by CM. In comparison with regular ssGBLUP, the total computing time decreased because the inversion of the relationship matrix $A_{22}$ was omitted. When APY used 10,000 (20,000) animals in the core, the computing time per iteration was at most 0.44 (0.63) min for all the APY alternatives. A core of 10,000 animals in APY gave GEBVs sufficiently close to those from regular ssGBLUP but needed only 25% of the total computing time. The developed approaches to inverting the two relationship matrices are expected to allow a much higher number of genotyped animals than was used in this study.
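The partitioned expression for the inverse of the pedigree relationship matrix of genotyped animals can be illustrated with a small numerical sketch. The example below is not the authors' implementation: the six-animal pedigree, the function names, and the use of dense NumPy arrays are assumptions made for illustration, and the pedigree inverse is built with Henderson's rules ignoring inbreeding. It shows how a product of the inverse with a vector can be obtained from the sub-matrices of the inverse of the full pedigree relationship matrix and one linear solve, without ever forming the dense inverse.

```python
import numpy as np


def pedigree_inverse(pedigree):
    """Inverse of the pedigree relationship matrix by Henderson's rules.

    `pedigree` is a list of (sire, dam) index pairs, parents listed before
    progeny, with None for an unknown parent. Inbreeding is ignored
    (an assumption of this sketch).
    """
    n = len(pedigree)
    a_inv = np.zeros((n, n))
    for i, (sire, dam) in enumerate(pedigree):
        parents = [p for p in (sire, dam) if p is not None]
        # Inverse Mendelian sampling variance for 0, 1 or 2 known parents.
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(parents)]
        a_inv[i, i] += alpha
        for p in parents:
            a_inv[i, p] -= alpha / 2
            a_inv[p, i] -= alpha / 2
            for q in parents:
                a_inv[p, q] += alpha / 4
    return a_inv


def a22_inverse_times(a_inv, non_genotyped, genotyped, x):
    """Return inv(A22) @ x using only sub-matrices of inv(A).

    Group 1 is the non-genotyped animals, group 2 the genotyped animals.
    """
    a11 = a_inv[np.ix_(non_genotyped, non_genotyped)]
    a12 = a_inv[np.ix_(non_genotyped, genotyped)]
    a21 = a_inv[np.ix_(genotyped, non_genotyped)]
    a22 = a_inv[np.ix_(genotyped, genotyped)]
    # One linear solve with the non-genotyped block replaces its inverse.
    return a22 @ x - a21 @ np.linalg.solve(a11, a12 @ x)


# Hypothetical pedigree: animals 0-2 are founders, 3-5 are genotyped progeny.
pedigree = [(None, None), (None, None), (None, None), (0, 1), (0, 2), (3, 4)]
non_genotyped, genotyped = [0, 1, 2], [3, 4, 5]

a_inv = pedigree_inverse(pedigree)
x = np.array([1.0, -2.0, 0.5])
product = a22_inverse_times(a_inv, non_genotyped, genotyped, x)

# Reference: form A explicitly, extract A22 and invert it directly.
a = np.linalg.inv(a_inv)
a22_direct = a[np.ix_(genotyped, genotyped)]
direct = np.linalg.solve(a22_direct, x)

print(np.allclose(product, direct))
```

In a real evaluation the sub-matrices would be stored in sparse form and the solve would use an iterative method or a sparse Cholesky factorisation, which is where approaches such as IOP, IM and CM differ; the dense arrays here only keep the sketch short.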