Plant Biology and Crop Science, Rothamsted Research, Harpenden, AL5 2JQ, UK.
Agricultural Institute, Centre for Agricultural Research, Hungarian Academy of Sciences, P.O. Box 19, 2462 Martonvásár, Hungary.
BMC Genet. 2015 Feb 26;16:19. doi: 10.1186/s12863-015-0169-0.
Genomic prediction of agronomic traits as targets for selection in plant breeding programmes is increasingly common. The methods employed can also be applied to predict traits from other sources of covariates, such as metabolomics. However, prediction combining sets of covariates can be less accurate than using the best of the individual sets.
We describe a method, termed Differentially Penalized Regression (DiPR), which uses standard ridge regression software to combine sets of covariates while applying independent penalties to each. In a dataset of wheat varieties, field traits are better predicted, on average, by seed metabolites than by genetic markers, but DiPR using both sets of predictors is best.
DiPR is a simple and accessible method of using existing software to combine multiple sets of covariates in trait prediction when there are more predictors than observations and the contribution to accuracy from each set differs.
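The idea of applying an independent penalty to each covariate set while still calling a standard ridge routine can be illustrated with a rescaling trick. The sketch below is not taken from the paper and may differ from the authors' implementation: it assumes centred data with no intercept, and the function names (`ridge_fit`, `block_penalized_fit`), the synthetic data, and the penalty values are all hypothetical. Dividing block k by the square root of its penalty and then fitting an ordinary unit-penalty ridge gives the same solution as penalizing block k by that penalty directly.

```python
import numpy as np

def ridge_fit(X, y, penalty=1.0):
    """Ordinary ridge regression with a single penalty (no intercept).

    Solves (X'X + penalty * I) b = X'y; stands in for any standard ridge routine.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(p), X.T @ y)

def block_penalized_fit(blocks, y, penalties):
    """Ridge fit with a separate penalty per covariate block (illustrative).

    Each block X_k is multiplied by 1/sqrt(penalty_k), the blocks are stacked,
    and a unit-penalty ridge is fitted. Coefficients are then mapped back to
    the original scale of each block.
    """
    scales = [1.0 / np.sqrt(lam) for lam in penalties]
    X_scaled = np.hstack([X * s for X, s in zip(blocks, scales)])
    b_scaled = ridge_fit(X_scaled, y, penalty=1.0)

    coefs, start = [], 0
    for X, s in zip(blocks, scales):
        stop = start + X.shape[1]
        coefs.append(b_scaled[start:stop] * s)  # undo the rescaling
        start = stop
    return coefs

# Synthetic stand-ins for two covariate sets (more predictors than observations).
rng = np.random.default_rng(0)
n = 30
markers = rng.normal(size=(n, 60))       # hypothetical genetic markers
metabolites = rng.normal(size=(n, 40))   # hypothetical seed metabolites
trait = metabolites[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=n)

# Centre everything, since the sketch fits no intercept.
markers -= markers.mean(axis=0)
metabolites -= metabolites.mean(axis=0)
trait -= trait.mean()

# Arbitrary example penalties: heavier shrinkage on markers than on metabolites.
b_markers, b_metabolites = block_penalized_fit(
    [markers, metabolites], trait, penalties=[50.0, 5.0]
)
fitted = markers @ b_markers + metabolites @ b_metabolites
```

In practice the per-set penalties would be chosen by cross-validation rather than fixed in advance, and predictive accuracy would be judged on held-out varieties rather than on the fitted values.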