Montesinos-López Osval A, Crespo-Herrera Leonardo, Pierre Carolina Saint, Cano-Paez Bernabe, Huerta-Prado Gloria Isabel, Mosqueda-González Brandon Alejandro, Ramos-Pulido Sofia, Gerard Guillermo, Alnowibet Khalid, Fritsche-Neto Roberto, Montesinos-López Abelardo, Crossa José
Facultad de Telemática, Universidad de Colima, Colima, Mexico.
International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Edo. de Mexico, Mexico.
Front Plant Sci. 2024 May 15;15:1349569. doi: 10.3389/fpls.2024.1349569. eCollection 2024.
Because genomic selection (GS) is a predictive methodology, it must deliver high prediction accuracy to be useful in practice. However, many factors affect its prediction performance, so its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve the prediction performance of this methodology.
When environmental covariates are incorporated as inputs in genomic prediction models, they only sometimes increase prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models.
We found that, across data sets, feature engineering reduced prediction error by 761.625% across predictors, relative to including the environmental covariates without feature engineering. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to establish a robust feature engineering strategy for incorporating the environmental covariates.
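As a rough illustration of what feature engineering on environmental covariates could look like, the sketch below expands a small environments-by-covariates matrix with squared terms and pairwise products. The abstract does not say which transformations were used, so the function name `engineer_covariates`, the choice of squares and interactions, and the toy values are all assumptions made for this example rather than the authors' method.

```python
import numpy as np
from itertools import combinations


def engineer_covariates(ec):
    """Expand an (environments x covariates) matrix with derived features.

    Hypothetical transformations, chosen only for illustration:
    standardized covariates, their squares, and their pairwise products.
    """
    ec = np.asarray(ec, dtype=float)

    # Standardize each covariate column; constant columns keep a divisor of 1.
    mean = ec.mean(axis=0)
    std = ec.std(axis=0)
    std[std == 0] = 1.0
    z = (ec - mean) / std

    # Squared terms capture simple non-linear responses.
    squares = z ** 2

    # Pairwise products capture interactions between covariates.
    pairs = [z[:, i] * z[:, j] for i, j in combinations(range(z.shape[1]), 2)]
    interactions = np.column_stack(pairs) if pairs else np.empty((z.shape[0], 0))

    return np.hstack([z, squares, interactions])


# Toy data (made-up values): 4 environments, 3 covariates.
toy = np.array([
    [20.1, 300.0, 5.5],
    [25.3, 120.0, 6.1],
    [18.7, 450.0, 5.9],
    [22.0, 210.0, 6.4],
])
features = engineer_covariates(toy)
print(features.shape)  # (4, 9): 3 standardized + 3 squared + 3 pairwise products
```

The expanded matrix would then take the place of the raw covariates as input to a genomic prediction model; whether it helps has to be checked per data set, in line with the abstract's caveat that gains appeared in only some of them.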