Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials.

Affiliations

Department of Crop, Soil, and Environmental Sciences, Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA.

Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, USA.

Publication information

Theor Appl Genet. 2024 Jul 23;137(8):189. doi: 10.1007/s00122-024-04687-w.

DOI: 10.1007/s00122-024-04687-w
PMID: 39044035
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11266441/
Abstract

Incorporating feature-engineered environmental data into machine learning-based genomic prediction models is an efficient approach to indirectly model genotype-by-environment interactions. Complementing phenotypic traits and molecular markers with high-dimensional data such as climate and soil information is becoming a common practice in breeding programs. This study explored new ways to combine non-genetic information in genomic prediction models using machine learning. Using the multi-environment trial data from the Genomes To Fields initiative, different models to predict maize grain yield were fitted using various inputs: genetic, environmental, or a combination of both, either in an additive (genetic-and-environmental; G+E) or a multiplicative (genotype-by-environment interaction; GEI) manner. When environmental data were included, the mean prediction accuracy of machine learning genomic prediction models increased by up to 7% over the well-established Factor Analytic Multiplicative Mixed Model across the three cross-validation scenarios evaluated. Moreover, using the G+E model was more advantageous than the GEI model given its superior, or at least comparable, prediction accuracy, its lower use of computational memory and time, and the flexibility of accounting for interactions by construction. Our results illustrate the flexibility provided by the ML framework, particularly with feature engineering. We show that the feature engineering stage offers a viable option for envirotyping and generates valuable information for machine learning-based genomic prediction models. Furthermore, we verified that genotype-by-environment interactions may be considered using tree-based approaches without explicitly including interactions in the model. These findings support the growing interest in merging high-dimensional genotypic and environmental data into predictive modeling.
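The difference between the additive (G+E) and multiplicative (GEI) inputs can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not say how the GEI features were built, so the marker-by-covariate product used here is an assumption (one common way to encode interactions explicitly), and all names and values (`additive_features`, `multiplicative_features`, the toy hybrids and environments) are hypothetical.

```python
# Illustrative sketch only: toy data and hypothetical names, not the paper's code.

def additive_features(markers, env_covariates):
    """G+E input: marker columns followed by environmental columns."""
    return list(markers) + list(env_covariates)


def multiplicative_features(markers, env_covariates):
    """GEI input (assumed form): one column per marker-by-covariate product."""
    return [m * e for m in markers for e in env_covariates]


# Invented values: 3 marker scores per hybrid, 2 covariates per environment.
hybrids = {"H1": [0, 1, 2], "H2": [2, 1, 0]}
environments = {"E1": [0.5, -1.0], "E2": [1.5, 0.2]}

g_plus_e = {}
gei = {}
for hybrid, markers in hybrids.items():
    for env, covariates in environments.items():
        # One feature row per hybrid-by-environment combination.
        g_plus_e[(hybrid, env)] = additive_features(markers, covariates)
        gei[(hybrid, env)] = multiplicative_features(markers, covariates)

print(len(g_plus_e[("H1", "E1")]))  # p + q columns: 3 + 2 = 5
print(len(gei[("H1", "E1")]))       # p * q columns: 3 * 2 = 6
```

With p markers and q environmental covariates, the additive input has p + q columns while this multiplicative encoding has p × q. The gap is negligible in the toy case but grows quickly with thousands of markers, which is in line with the lower memory and time use the abstract reports for G+E. A tree-based learner given the additive input can still split on a marker and then on a covariate, which is how interactions can be captured without explicit product columns.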

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e589/11266441/b95a333caa5f/122_2024_4687_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e589/11266441/8c273ce5f423/122_2024_4687_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e589/11266441/4a3297fc18e1/122_2024_4687_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e589/11266441/1570e42a4404/122_2024_4687_Fig4_HTML.jpg
