Department of Botany and Plant Sciences, University of California, Riverside, CA, 92521, USA.
College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu, China.
Heredity (Edinb). 2018 Jul;121(1):12-23. doi: 10.1038/s41437-018-0078-x. Epub 2018 May 1.
Many statistical methods are available for genomic selection (GS), in which genetic values of quantitative traits in plants and animals are predicted from whole-genome SNP data. Having far more predictors than subjects poses a major computational challenge in GS. Principal components regression (PCR) and its derivative, partial least squares regression (PLSR), address this challenge through dimensionality reduction. In this study, we show that PCR can perform better than PLSR in cross validation. PCR often requires more components than PLSR to reach its maximum predictive ability and may therefore carry a higher computational cost. However, applying the HAT method (a strategy that describes the relationship between the fitted and observed response variables with a hat matrix) to PCR circumvents conventional cross validation when testing predictive ability, which makes PCR substantially more computationally efficient than PLSR, where cross validation is mandatory. The advantages of PCR over PLSR are illustrated with a simulated trait in a hypothetical population and four agronomic traits in a rice population. The benefit of using PCR in genomic selection is further demonstrated by predicting 1000 metabolomic traits and 24,973 transcriptomic traits in the same rice population.
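The hat-matrix idea described in the abstract can be illustrated with a minimal sketch. It assumes that the principal component scores are computed once from all subjects and then treated as a fixed design, so that leave-one-out predictions follow from the standard least-squares identity e_(i) = e_i / (1 - h_ii). The function name `pcr_hat_loo`, the simulated genotypes, and the use of a squared correlation as the measure of predictive ability are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def pcr_hat_loo(X, y, k):
    """PCR with k components plus leave-one-out predictions from the hat matrix.

    Assumption: component scores come from an SVD of the full, column-centered
    marker matrix and are held fixed, so no model is refitted per fold.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                        # center each marker column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Uk = U[:, :k]                                  # orthonormal score directions
    # Design is [intercept, Uk]. Centering makes Uk orthogonal to the vector
    # of ones, so the hat matrix is the sum of the two projections.
    H = np.full((n, n), 1.0 / n) + Uk @ Uk.T
    y_hat = H @ y                                  # fitted values
    h = np.diag(H)                                 # leverages h_ii
    y_loo = y - (y - y_hat) / (1.0 - h)            # leave-one-out predictions
    return y_hat, y_loo, h


# Hypothetical data: 40 subjects, 300 SNPs coded 0/1/2, 5 causal markers.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(40, 300)).astype(float)
y = X[:, :5].sum(axis=1) + rng.normal(size=40)

y_hat, y_loo, h = pcr_hat_loo(X, y, k=8)
# One possible measure of predictive ability: squared correlation between
# observed values and leave-one-out predictions.
predictability = np.corrcoef(y, y_loo)[0, 1] ** 2
print(f"leave-one-out squared correlation: {predictability:.3f}")
```

Under this assumption a single SVD replaces one model fit per fold, which is where the computational saving comes from. Recomputing the components inside every fold, as conventional cross validation would, can give slightly different predictions.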