
Stacked ensembles on basis of parentage information can predict hybrid performance with an accuracy comparable to marker-based GBLUP.

Authors

Heilmann Philipp Georg, Frisch Matthias, Abbadi Amine, Kox Tobias, Herzog Eva

Affiliations

Institute of Agronomy and Plant Breeding II, Justus Liebig University, Gießen, Germany.

NPZ Innovation GmbH, Holtsee, Germany.

Publication information

Front Plant Sci. 2023 Jul 21;14:1178902. doi: 10.3389/fpls.2023.1178902. eCollection 2023.

DOI:10.3389/fpls.2023.1178902
PMID:37546247
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10401275/
Abstract

Testcross factorials in newly established hybrid breeding programs are often highly unbalanced, incomplete, and characterized by a predominance of specific combining ability (SCA) over general combining ability (GCA). This results in a low efficiency of GCA-based selection. Machine learning algorithms might improve prediction of hybrid performance in such testcross factorials, as they have been successfully applied to find complex underlying patterns in sparse data. Our objective was to compare the prediction accuracy of machine learning algorithms to that of GCA-based prediction and genomic best linear unbiased prediction (GBLUP) in six unbalanced incomplete factorials from hybrid breeding programs of rapeseed, wheat, and corn. We investigated a range of machine learning algorithms with three different types of predictor variables: (a) information on the parentage of hybrids, (b) additionally, the hybrid performance of crosses of the parental lines with other crossing partners, and (c) genotypic marker data. In two highly incomplete and unbalanced factorials from rapeseed, in which the SCA variance contributed considerably to the genetic variance, stacked ensembles of gradient boosting machines based on parentage information outperformed GCA prediction. Compared with GCA prediction, the stacked ensembles increased prediction accuracy from 0.39 to 0.45 and from 0.48 to 0.54. Without marker data, the stacked ensembles reached a prediction accuracy comparable to that of GBLUP, which requires marker data. We conclude that hybrid prediction with stacked ensembles of gradient boosting machines based on parentage information is a promising approach that is worth further investigation with other data sets in which SCA variance is high.
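To make the GCA-based baseline in the abstract concrete, the following is a minimal sketch of how such a prediction could work in an incomplete factorial. It is not the authors' implementation. The abstract does not describe their estimator, so the sketch assumes a naive means-based GCA estimate (each parent's mean hybrid performance minus the overall mean) and assumes that prediction accuracy is measured as the Pearson correlation between predicted and observed values. All parent names, yields, and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def fit_gca(train):
    """Estimate the overall mean and per-parent GCA effects (naive means, an assumption).

    `train` is a list of (female, male, performance) tuples from an
    incomplete factorial; not every female x male cross has to be present.
    """
    mu = sum(y for _, _, y in train) / len(train)
    by_female, by_male = defaultdict(list), defaultdict(list)
    for female, male, y in train:
        by_female[female].append(y)
        by_male[male].append(y)
    # GCA of a parent = mean of its tested hybrids minus the overall mean.
    gca_f = {p: sum(v) / len(v) - mu for p, v in by_female.items()}
    gca_m = {p: sum(v) / len(v) - mu for p, v in by_male.items()}
    return mu, gca_f, gca_m


def predict_gca(model, female, male):
    """Predict an untested hybrid as mu + GCA(female) + GCA(male); SCA is ignored."""
    mu, gca_f, gca_m = model
    # A parent without any tested hybrid falls back to a GCA of zero.
    return mu + gca_f.get(female, 0.0) + gca_m.get(male, 0.0)


def pearson(xs, ys):
    """Pearson correlation, used here as the (assumed) prediction accuracy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical, unbalanced training factorial: 7 of 9 possible crosses tested.
train = [
    ("F1", "M1", 10.0), ("F1", "M2", 12.0), ("F1", "M3", 8.0),
    ("F2", "M2", 9.0), ("F2", "M3", 7.0),
    ("F3", "M1", 13.0), ("F3", "M3", 11.0),
]
# Hypothetical validation crosses; F4 never appears in the training data.
test = [("F2", "M1", 9.0), ("F3", "M2", 13.0), ("F4", "M1", 12.0)]

model = fit_gca(train)
predicted = [predict_gca(model, f, m) for f, m, _ in test]
observed = [y for _, _, y in test]
accuracy = pearson(predicted, observed)
print("predicted:", predicted)
print("accuracy (Pearson r):", round(accuracy, 3))
```

Because the prediction is purely additive in the two parental effects, it cannot capture cross-specific deviations. This matches the abstract's point that GCA-based selection becomes inefficient when SCA variance is high.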

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d60c/10401275/435d1658ade8/fpls-14-1178902-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d60c/10401275/14f6dfff5f10/fpls-14-1178902-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d60c/10401275/e305cb9c3e42/fpls-14-1178902-g002.jpg

Similar articles

1
Stacked ensembles on basis of parentage information can predict hybrid performance with an accuracy comparable to marker-based GBLUP.
Front Plant Sci. 2023 Jul 21;14:1178902. doi: 10.3389/fpls.2023.1178902. eCollection 2023.
2
Prediction of additive, epistatic, and dominance effects using models accounting for incomplete inbreeding in parental lines of hybrid rye and sugar beet.
Front Plant Sci. 2023 Nov 2;14:1193433. doi: 10.3389/fpls.2023.1193433. eCollection 2023.
3
Prediction of single-cross hybrid performance for grain yield and grain dry matter content in maize using AFLP markers associated with QTL.
Theor Appl Genet. 2006 Oct;113(6):1037-47. doi: 10.1007/s00122-006-0363-6. Epub 2006 Aug 3.
4
Prediction of hybrid performance in maize using molecular markers and joint analyses of hybrids and parental inbreds.
Theor Appl Genet. 2010 Jan;120(2):451-61. doi: 10.1007/s00122-009-1208-x. Epub 2009 Nov 15.
5
Genome-wide regression models considering general and specific combining ability predict hybrid performance in oilseed rape with similar accuracy regardless of trait architecture.
Theor Appl Genet. 2018 Feb;131(2):299-317. doi: 10.1007/s00122-017-3002-5. Epub 2017 Oct 28.
6
Genomic Prediction of Sunflower Hybrids Oil Content.
Front Plant Sci. 2017 Sep 21;8:1633. doi: 10.3389/fpls.2017.01633. eCollection 2017.
7
Revisiting hybrid breeding designs using genomic predictions: simulations highlight the superiority of incomplete factorials between segregating families over topcross designs.
Theor Appl Genet. 2020 Jun;133(6):1995-2010. doi: 10.1007/s00122-020-03573-5. Epub 2020 Mar 17.
8
Efficient Genomic Prediction of Yield and Dry Matter in Hybrid Potato.
Plants (Basel). 2023 Jul 11;12(14):2617. doi: 10.3390/plants12142617.
9
Heterotic patterns in rapeseed (Brassica napus L.): I. Crosses between spring and Chinese semi-winter lines.
Theor Appl Genet. 2007 Jun;115(1):27-34. doi: 10.1007/s00122-007-0537-x. Epub 2007 Apr 24.
10
Prediction of single-cross hybrid performance in maize using haplotype blocks associated with QTL for grain yield.
Theor Appl Genet. 2007 May;114(8):1345-55. doi: 10.1007/s00122-007-0521-5. Epub 2007 Feb 24.

Cited by

1
Breeding perspectives on tackling trait genome-to-phenome (G2P) dimensionality using ensemble-based genomic prediction.
Theor Appl Genet. 2025 Jul 4;138(7):172. doi: 10.1007/s00122-025-04960-6.
2
Improved genomic prediction performance with ensembles of diverse models.
G3 (Bethesda). 2025 May 8;15(5). doi: 10.1093/g3journal/jkaf048.
3
Portability of genomic predictions trained on sparse factorial designs across two maize silage breeding cycles.
Theor Appl Genet. 2024 Mar 7;137(3):75. doi: 10.1007/s00122-024-04566-4.
