Liu Hailan, Chen Guo-Bo
Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan Province, 611130, China.
Evergreen Landscape and Architecture Studio, Xixi Road 562, Hangzhou, Zhejiang Province, 310026, China.
Theor Appl Genet. 2017 Jun;130(6):1277-1284. doi: 10.1007/s00122-017-2887-3. Epub 2017 Apr 7.
We propose a novel computational method for genomic selection, called HE-BLP, that combines identical-by-state (IBS)-based Haseman-Elston (HE) regression with best linear prediction (BLP). Genomic best linear unbiased prediction (GBLUP) has been widely used in whole-genome prediction for breeding programs. To determine the total genetic variance of a training population, GBLUP requires solving a linear mixed model (LMM) via restricted maximum likelihood (REML), whose computational complexity is cubic in the sample size. With HE-BLP, the total genetic variance is instead estimated by solving a simple HE linear regression, whose computational complexity is quadratic in the sample size. The method is therefore suitable for large-scale genomic data, except for data in which environmental effects need to be estimated simultaneously, because HE regression does not allow for this estimation. In Monte Carlo simulation studies, the estimated heritability based on HE was identical to that based on REML, and the prediction accuracy of HE-BLP and traditional GBLUP was also quite similar when quantitative trait loci (QTLs) were randomly distributed along the genome and their effects followed a normal distribution. In addition, the kernel row number (KRN) trait in a maize IBM population was used to evaluate the performance of the two methods. The results showed similar prediction accuracy of breeding values despite slightly different heritability estimates from HE and REML, probably due to the underlying genetic architecture. HE-BLP may become a method of choice for genomic selection on even larger sets of genomic data in the special cases where environmental effects can be ignored. The software for HE regression and the simulation program are available online in the Genetic Analysis Repository (GEAR; https://github.com/gc5k/GEAR/wiki).
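The two steps described in the abstract (HE regression for the variance components, then BLP for the breeding values) can be illustrated with a minimal sketch. This is not the GEAR implementation. It assumes the squared-difference form of HE regression, in which E[(y_i - y_j)^2] is approximately 2σ²_p - 2σ²_g·G_ij, and it uses a standardized genomic relationship matrix as a stand-in for the IBS-based relatedness measure. The function names (`standardize`, `grm`, `he_regression`, `blp`, `simulate`) and all simulation settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def standardize(genotypes):
    """Center and scale an (n x m) matrix of 0/1/2 genotype codes per marker."""
    freq = genotypes.mean(axis=0) / 2.0
    keep = (freq > 0) & (freq < 1)          # drop monomorphic markers
    geno, freq = genotypes[:, keep], freq[keep]
    return (geno - 2 * freq) / np.sqrt(2 * freq * (1 - freq))

def grm(genotypes):
    """Assumed relatedness measure: standardized genomic relationship matrix."""
    z = standardize(genotypes)
    return z @ z.T / z.shape[1]

def he_regression(y, G):
    """Regress squared phenotype differences on relatedness over all pairs i < j.

    Under the assumed model the intercept estimates 2*var_p and the slope
    estimates -2*var_g. The cost is one pass over n*(n-1)/2 pairs.
    """
    i, j = np.triu_indices(len(y), k=1)
    slope, intercept = np.polyfit(G[i, j], (y[i] - y[j]) ** 2, 1)
    return -slope / 2.0, intercept / 2.0    # (var_g, var_p)

def blp(y_train, G_tt, G_nt, var_g, var_p):
    """Best linear prediction of breeding values for new individuals.

    G_tt: relatedness among training individuals.
    G_nt: relatedness between new (rows) and training (columns) individuals.
    """
    var_e = var_p - var_g
    V = var_g * G_tt + var_e * np.eye(len(y_train))
    return var_g * G_nt @ np.linalg.solve(V, y_train - y_train.mean())

def simulate(n, m, h2, seed=0):
    """Toy data: every marker is a QTL with a normally distributed effect."""
    rng = np.random.default_rng(seed)
    freq = rng.uniform(0.1, 0.9, size=m)
    genotypes = rng.binomial(2, freq, size=(n, m)).astype(float)
    z = standardize(genotypes)
    g = z @ rng.normal(size=z.shape[1])
    g = (g - g.mean()) / g.std() * np.sqrt(h2)   # genetic variance = h2
    y = g + rng.normal(scale=np.sqrt(1 - h2), size=n)
    return y, grm(genotypes), g

if __name__ == "__main__":
    y, G, g = simulate(n=800, m=200, h2=0.5, seed=1)
    train, test = slice(0, 600), slice(600, 800)
    var_g, var_p = he_regression(y[train], G[train, train])
    pred = blp(y[train], G[train, train], G[test, train], var_g, var_p)
    print("estimated h2:", var_g / var_p)
    print("accuracy (corr. with true breeding value):",
          np.corrcoef(pred, g[test])[0, 1])
```

In this sketch the variance components come from a single ordinary least-squares fit over all pairs of individuals, which is where the quadratic cost mentioned in the abstract arises. No environmental or other fixed effects are modelled beyond the phenotype mean, in line with the limitation stated above.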