Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy?

Authors

Montesinos-López Osval A, Crespo-Herrera Leonardo, Saint Pierre Carolina, Bentley Alison R, de la Rosa-Santamaria Roberto, Ascencio-Laguna José Alejandro, Agbona Afolabi, Gerard Guillermo S, Montesinos-López Abelardo, Crossa José

Affiliations

Facultad de Telemática, Universidad de Colima, Colima, Mexico.

International Maize and Wheat Improvement Center (CIMMYT), El Batán, Mexico.

Publication information

Front Genet. 2023 Jul 24;14:1209275. doi: 10.3389/fgene.2023.1209275. eCollection 2023.

Abstract

Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environment trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS and proposes two feature selection methods (Pearson's correlation and Boruta) for choosing which environmental covariates to include. Results indicate that simply incorporating environmental covariates may increase or decrease prediction accuracy, depending on the case. In contrast, incorporating environmental covariates chosen by feature selection significantly improves prediction accuracy in four out of six datasets, with gains between 14.25% and 218.71% in terms of Normalized Root Mean Squared Error (NRMSE) under a leave-one-environment-out cross-validation scenario; no relevant gain was observed in terms of Pearson's correlation. In the two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. The study therefore provides empirical evidence supporting the use of feature selection to improve the predictive power of GS.
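The evaluation scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a correlation threshold of 0.5, a Normalized Root Mean Squared Error defined as the RMSE divided by the mean of the observed values, and an ordinary least-squares fit standing in for the genomic prediction model. None of these choices are stated in the abstract. The function names `pearson_select`, `nrmse`, and `leave_one_environment_out` are hypothetical, and Boruta is not shown.

```python
import numpy as np


def pearson_select(X, y, threshold=0.5):
    """Return indices of columns of X whose |Pearson r| with y is >= threshold.

    The threshold value is an assumption made for illustration.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Zero-variance columns get r = 0 instead of triggering a division by zero.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.flatnonzero(np.abs(r) >= threshold)


def nrmse(y_true, y_pred):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(y_true)


def leave_one_environment_out(X, y, env, threshold=0.5):
    """Hold out each environment in turn and return its NRMSE.

    Covariates are selected on the training environments only, so the
    held-out environment never influences which covariates are kept.
    """
    scores = {}
    for e in np.unique(env):
        test = env == e
        train = ~test
        cols = pearson_select(X[train], y[train], threshold)
        # Design matrix: intercept plus the selected covariates.
        A_train = np.column_stack([np.ones(train.sum()), X[train][:, cols]])
        A_test = np.column_stack([np.ones(test.sum()), X[test][:, cols]])
        # Least squares is a stand-in for the genomic prediction model.
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        scores[e] = nrmse(y[test], A_test @ beta)
    return scores
```

Here `X` would hold one row per observation and one column per environmental covariate, `y` the response, and `env` an environment label per row. Running the selection inside the cross-validation loop, rather than once on all data, keeps information from the held-out environment out of the selection step.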

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7256/10405933/1f2fc8c2566f/fgene-14-1209275-g001.jpg
