Partial least squares regression, support vector machine regression, and transcriptome-based distances for prediction of maize hybrid performance with gene expression data.

Affiliation

Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China.

Publication information

Theor Appl Genet. 2012 Mar;124(5):825-33. doi: 10.1007/s00122-011-1747-9. Epub 2011 Nov 19.

PMID: 22101908

Abstract

The performance of hybrids can be predicted with gene expression data from their parental inbred lines. Implementing such prediction approaches in breeding programs promises to increase the efficiency of hybrid breeding. The objectives of our study were to compare the accuracy of prediction models employing multiple linear regression (MLR), partial least squares regression (PLS), support vector machine regression (SVM), and transcriptome-based distances (D(B)). For a factorial of 7 flint and 14 dent maize lines, the grain yield of the hybrids was assessed and the gene expression of the parental lines was profiled with a 56k microarray. The accuracy of the prediction models was measured by the correlation between predicted and observed yield employing two cross-validation schemes. The first modeled the prediction of hybrids when testcross data are available for both parental lines (type 2 hybrids), and the second modeled the prediction of hybrids when no testcross data for the parental lines were available (type 0 hybrids). MLR, SVM, and PLS resulted in a high correlation between predicted and observed yield for type 2 hybrids, whereas for type 0 hybrids D(B) had greater prediction accuracy. The regression methods were robust to the choice of the set of profiled genes and required only a few hundred genes. In contrast, for an accurate hybrid prediction with D(B), 1,000-1,500 genes were required, and the prediction accuracy depended strongly on the set of profiled genes. We conclude that for prediction within one set of genetic material MLR is a promising approach, and for transferring prediction models from one set of genetic material to a related one, the transcriptome-based distance D(B) is most promising.
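The two cross-validation schemes can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not give the sampling details, so the split below (a subset of flint and dent lines treated as training parents, and a fraction of their crosses treated as evaluated) is an assumption, and the names `split_factorial`, `prediction_accuracy`, and `train_fraction` are hypothetical. Each hybrid is labeled by how many of its parents have testcross data, and accuracy is taken as the Pearson correlation between predicted and observed yield.

```python
import numpy as np

def split_factorial(n_flint, n_dent, train_flint, train_dent, train_fraction, rng):
    """Label every flint x dent hybrid by its number of training parents (0, 1, 2).

    Assumption: among crosses of two training parents, a random fraction is
    used to fit the model; the remaining such crosses are the type 2 test
    hybrids. Crosses with no training parent are the type 0 test hybrids.
    """
    flint_known = np.isin(np.arange(n_flint), train_flint)
    dent_known = np.isin(np.arange(n_dent), train_dent)
    # n_known[i, j] = number of parents of hybrid (i, j) with testcross data
    n_known = flint_known[:, None].astype(int) + dent_known[None, :].astype(int)

    candidates = np.argwhere(n_known == 2)
    n_train = int(round(train_fraction * len(candidates)))
    picked = candidates[rng.permutation(len(candidates))[:n_train]]

    is_train = np.zeros((n_flint, n_dent), dtype=bool)
    is_train[picked[:, 0], picked[:, 1]] = True
    return n_known, is_train

def prediction_accuracy(predicted, observed):
    """Pearson correlation between predicted and observed yield."""
    return float(np.corrcoef(predicted, observed)[0, 1])

# Illustrative split of the 7 x 14 factorial (the line choices are arbitrary).
rng = np.random.default_rng(0)
n_known, is_train = split_factorial(
    n_flint=7, n_dent=14,
    train_flint=[0, 1, 2, 3], train_dent=list(range(8)),
    train_fraction=0.5, rng=rng,
)
type2_test = (n_known == 2) & ~is_train   # both parents seen in training
type0_test = n_known == 0                 # neither parent seen in training
```

With 4 of 7 flint and 8 of 14 dent lines as training parents, 32 crosses have two training parents and 18 have none. A model fitted on the `is_train` hybrids would then be scored separately on `type2_test` and `type0_test` with `prediction_accuracy`.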
