L2-Boosting algorithm applied to high-dimensional problems in genomic selection.

Authors

González-Recio Oscar, Weigel Kent A, Gianola Daniel, Naya Hugo, Rosa Guilherme J M

Affiliation

Departamento de Mejora Genética Animal, Instituto Nacional de Investigaciones Agrarias, Madrid 28040, Spain.

Publication information

Genet Res (Camb). 2010 Jun;92(3):227-37. doi: 10.1017/S0016672310000261.

DOI:10.1017/S0016672310000261
PMID:20667166
Abstract

The L2-Boosting algorithm is one of the most promising machine-learning techniques to have appeared in recent decades. It may be applied to high-dimensional problems such as whole-genome studies, and it is relatively simple from a computational point of view. In this study, we used this algorithm in a genomic selection context to make predictions of yet-to-be-observed outcomes. Two data sets were used: (1) productive lifetime predicted transmitting abilities from 4702 Holstein sires genotyped for 32 611 single nucleotide polymorphisms (SNPs) derived from the Illumina BovineSNP50 BeadChip, and (2) progeny averages of food conversion rate, pre-corrected by environmental and mate effects, in 394 broilers genotyped for 3481 SNPs. Each of these data sets was split into training and testing sets, the latter comprising dairy or broiler sires whose ancestors were in the training set. Two weak learners, ordinary least squares (OLS) and non-parametric (NP) regression, were used for the L2-Boosting algorithm, to provide a stringent evaluation of the procedure. This algorithm was compared with BL [Bayesian LASSO (least absolute shrinkage and selection operator)] and BayesA regression. Learning tasks were carried out in the training set, whereas validation of the models was performed in the testing set. Pearson correlations between predicted and observed responses in the dairy cattle (broiler) data set were 0.65 (0.33), 0.53 (0.37), 0.66 (0.26) and 0.63 (0.27) for OLS-Boosting, NP-Boosting, BL and BayesA, respectively. The smallest bias and mean-squared errors (MSEs) were obtained with OLS-Boosting in both the dairy cattle (0.08 and 1.08, respectively) and broiler (-0.011 and 0.006, respectively) data sets. In the dairy cattle data set, the BL was more accurate (bias=0.10 and MSE=1.10) than BayesA (bias=1.26 and MSE=2.81), whereas no differences between these two methods were found in the broiler data set. L2-Boosting with a suitable learner was found to be a competitive alternative for genomic selection applications, providing high accuracy and low bias in genomic-assisted evaluations with a relatively short computational time.
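
The abstract names the method but does not spell out its steps. As a rough illustration only, one common formulation is componentwise L2-Boosting with a single-SNP OLS weak learner: at each iteration the current residuals are regressed on every SNP separately, the SNP that reduces the residual sum of squares most is selected, and a shrunken fraction of its fit is added to the model. The sketch below follows that formulation. The function names `l2_boost_ols` and `evaluate`, the shrinkage factor `nu`, the fixed iteration count, the simulated genotypes, and the definition of bias as the mean of predicted minus observed values are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def l2_boost_ols(X, y, n_iter=200, nu=0.1):
    """Componentwise L2-Boosting with a one-covariate OLS weak learner.

    Hypothetical helper: returns (intercept, coef) so that predictions
    are intercept + X @ coef. n_iter and nu are illustrative defaults;
    in practice the stopping point is usually tuned.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centred covariates
    ss = (Xc ** 2).sum(axis=0)           # per-column sum of squares
    ss[ss == 0.0] = np.inf               # constant columns are never selected
    intercept = y.mean()
    coef = np.zeros(X.shape[1])
    resid = y - intercept                # start from the mean-only model
    for _ in range(n_iter):
        num = Xc.T @ resid
        slope = num / ss                 # OLS slope of residuals on each column
        gain = num ** 2 / ss             # drop in residual sum of squares
        j = int(np.argmax(gain))         # best single covariate this round
        coef[j] += nu * slope[j]         # shrunken update
        resid -= nu * slope[j] * Xc[:, j]
    intercept -= x_mean @ coef           # undo the centring
    return intercept, coef


def evaluate(y_obs, y_pred):
    """Pearson correlation, bias (assumed: mean predicted - observed), MSE."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    err = y_pred - y_obs
    return r, err.mean(), (err ** 2).mean()


# Simulated stand-in for SNP genotypes coded 0/1/2 (not the paper's data).
rng = np.random.default_rng(0)
n, p = 300, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_effects = np.zeros(p)
true_effects[:10] = rng.normal(size=10)  # ten causal SNPs, the rest null
y = X @ true_effects + rng.normal(size=n)

train, test = np.arange(200), np.arange(200, n)
b0, b = l2_boost_ols(X[train], y[train])
r, bias, mse = evaluate(y[test], b0 + X[test] @ b)
print(f"r={r:.2f} bias={bias:.2f} MSE={mse:.2f} selected SNPs={np.count_nonzero(b)}")
```

Because only one coefficient is updated per iteration, the fitted model stays sparse when the number of iterations is small relative to the number of SNPs, which is one reason this kind of procedure scales to tens of thousands of markers.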

Similar articles

1
L2-Boosting algorithm applied to high-dimensional problems in genomic selection.
Genet Res (Camb). 2010 Jun;92(3):227-37. doi: 10.1017/S0016672310000261.
2
The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.
J Dairy Sci. 2013 Jan;96(1):614-24. doi: 10.3168/jds.2012-5630. Epub 2012 Oct 24.
3
Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.
J Dairy Sci. 2013 Jan;96(1):625-34. doi: 10.3168/jds.2012-5631. Epub 2012 Oct 24.
4
Accuracy of direct genomic values derived from imputed single nucleotide polymorphism genotypes in Jersey cattle.
J Dairy Sci. 2010 Nov;93(11):5423-35. doi: 10.3168/jds.2010-3149.
5
LASSO with cross-validation for genomic selection.
Genet Res (Camb). 2009 Dec;91(6):427-36. doi: 10.1017/S0016672309990334.
6
Assets of imputation to ultra-high density for productive and functional traits.
J Dairy Sci. 2013 Sep;96(9):6047-58. doi: 10.3168/jds.2013-6793. Epub 2013 Jun 28.
7
Machine learning classification procedure for selecting SNPs in genomic selection: application to early mortality in broilers.
J Anim Breed Genet. 2007 Dec;124(6):377-89. doi: 10.1111/j.1439-0388.2007.00694.x.
8
A simple method for genomic selection of moderately sized dairy cattle populations.
Animal. 2012 Feb;6(2):193-202. doi: 10.1017/S1751731111001704.
9
Application of Bayesian least absolute shrinkage and selection operator (LASSO) and BayesCπ methods for genomic selection in French Holstein and Montbéliarde breeds.
J Dairy Sci. 2013 Jan;96(1):575-91. doi: 10.3168/jds.2011-5225. Epub 2012 Nov 3.
10
Prediction of breed composition in an admixed cattle population.
Anim Genet. 2012 Dec;43(6):696-703. doi: 10.1111/j.1365-2052.2012.02345.x. Epub 2012 Mar 23.

Cited by

1
Accurate prediction of quantitative traits with failed SNP calls in canola and maize.
Front Plant Sci. 2023 Oct 23;14:1221750. doi: 10.3389/fpls.2023.1221750. eCollection 2023.
2
A review of machine learning models applied to genomic prediction in animal breeding.
Front Genet. 2023 Sep 6;14:1150596. doi: 10.3389/fgene.2023.1150596. eCollection 2023.
3
Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction.
Front Plant Sci. 2023 Jan 19;14:1092584. doi: 10.3389/fpls.2023.1092584. eCollection 2023.
4
SCALAR ON NETWORK REGRESSION VIA BOOSTING.
Ann Appl Stat. 2022 Dec;16(4):2755-2773. doi: 10.1214/22-aoas1612. Epub 2022 Sep 26.
5
Adding gene transcripts into genomic prediction improves accuracy and reveals sampling time dependence.
G3 (Bethesda). 2022 Nov 4;12(11). doi: 10.1093/g3journal/jkac258.
6
Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice.
G3 (Bethesda). 2022 Apr 4;12(4). doi: 10.1093/g3journal/jkac039.
7
Review: Application and Prospective Discussion of Machine Learning for the Management of Dairy Farms.
Animals (Basel). 2020 Sep 18;10(9):1690. doi: 10.3390/ani10091690.
8
Genomic Prediction for 25 Agronomic and Quality Traits in Alfalfa ().
Front Plant Sci. 2018 Aug 20;9:1220. doi: 10.3389/fpls.2018.01220. eCollection 2018.
9
Whole-genome regression and prediction methods applied to plant and animal breeding.
Genetics. 2013 Feb;193(2):327-45. doi: 10.1534/genetics.112.143313. Epub 2012 Jun 28.
10
Genome-wide prediction of discrete traits using Bayesian regressions and machine learning.
Genet Sel Evol. 2011 Feb 17;43(1):7. doi: 10.1186/1297-9686-43-7.