LASSO with cross-validation for genomic selection.

Authors

Usai M Graziano, Goddard Mike E, Hayes Ben J

Affiliation

Settore Genetica e Biotecnologie, AGRIS-Sardegna, Olmedo 07040, Italy.

Publication details

Genet Res (Camb). 2009 Dec;91(6):427-36. doi: 10.1017/S0016672309990334.

DOI:10.1017/S0016672309990334
PMID:20122298
Abstract

We used a least absolute shrinkage and selection operator (LASSO) approach to estimate marker effects for genomic selection. The least angle regression (LARS) algorithm and cross-validation were used to define the best subset of markers to include in the model. The LASSO-LARS approach was tested on two data sets: a simulated data set with 5865 individuals and 6000 Single Nucleotide Polymorphisms (SNPs); and a mouse data set with 1885 individuals genotyped for 10 656 SNPs and phenotyped for a number of quantitative traits. In the simulated data, three approaches were used to split the reference population into training and validation subsets for cross-validation: random splitting across the whole population, and random sampling of the validation set from the last generation only, either within or across families. The highest accuracy was obtained by random splitting across the whole population. The accuracy of genomic estimated breeding values (GEBVs) in the candidate population obtained by LASSO-LARS was 0.89 with 156 explanatory SNPs. This value was higher than those obtained by Best Linear Unbiased Prediction (BLUP) and a Bayesian method (BayesA), which were 0.75 and 0.84, respectively. In the mouse data, 1600 individuals were randomly allocated to the reference population. The GEBVs for the remaining 285 individuals estimated by LASSO-LARS were more accurate than those obtained by BLUP and BayesA for weight at six weeks and slightly less accurate for growth rate and body length. It was concluded that the LASSO-LARS approach is a good alternative method to estimate marker effects for genomic selection, particularly when the cost of genotyping can be reduced by using a limited subset of markers.
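
The workflow in the abstract (fit a LASSO on a reference population, pick the penalty by cross-validation with random splits, then score candidates) can be illustrated with a small sketch. This is not the authors' implementation: it swaps LARS for plain cyclic coordinate descent, uses only the Python standard library, and runs on toy simulated data. The function names, the -1/0/1 marker coding, the penalty grid, and the QTL effects are all assumptions made for illustration. Accuracy is taken here as the correlation between predicted and simulated true breeding values.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def fit_lasso(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    No intercept is fitted (assumes roughly centred markers and phenotypes)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # covariance of marker j with the residual that excludes marker j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def cv_choose_lambda(X, y, lambdas, k=4, seed=0):
    """k-fold cross-validation with random splits across the whole reference set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    best_lam, best_mse = None, float("inf")
    for lam in lambdas:
        sq_err = 0.0
        for fold in folds:
            held = set(fold)
            train = [i for i in idx if i not in held]
            beta = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
            preds = predict([X[i] for i in fold], beta)
            sq_err += sum((y[i] - ph) ** 2 for i, ph in zip(fold, preds))
        mse = sq_err / len(X)
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam

# Toy data (hypothetical): 100 individuals, 20 markers, 3 markers with true effects.
rng = random.Random(1)
qtl = {0: 1.0, 5: -0.8, 12: 0.7}
X = [[rng.choice((-1.0, 0.0, 1.0)) for _ in range(20)] for _ in range(100)]
tbv = [sum(row[j] * e for j, e in qtl.items()) for row in X]  # true breeding values
y = [g + rng.gauss(0.0, 0.5) for g in tbv]                    # phenotypes

X_ref, y_ref = X[:80], y[:80]          # reference population
X_cand, tbv_cand = X[80:], tbv[80:]    # candidates (phenotypes not used)

grid = [0.01, 0.05, 0.1, 0.3]
lam = cv_choose_lambda(X_ref, y_ref, grid)
beta = fit_lasso(X_ref, y_ref, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
accuracy = pearson(predict(X_cand, beta), tbv_cand)
print("lambda:", lam, "markers kept:", len(selected), "accuracy:", round(accuracy, 3))
```

Because the penalty sets many coefficients exactly to zero, `selected` plays the role of the reduced marker subset that the abstract highlights as a way to lower genotyping cost. A LARS-based solver would trace the whole regularisation path instead of refitting for each value in a grid.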

Similar articles

1
LASSO with cross-validation for genomic selection.
Genet Res (Camb). 2009 Dec;91(6):427-36. doi: 10.1017/S0016672309990334.
2
L2-Boosting algorithm applied to high-dimensional problems in genomic selection.
Genet Res (Camb). 2010 Jun;92(3):227-37. doi: 10.1017/S0016672310000261.
3
Application of Bayesian least absolute shrinkage and selection operator (LASSO) and BayesCπ methods for genomic selection in French Holstein and Montbéliarde breeds.
J Dairy Sci. 2013 Jan;96(1):575-91. doi: 10.3168/jds.2011-5225. Epub 2012 Nov 3.
4
Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.
J Dairy Sci. 2013 Jan;96(1):625-34. doi: 10.3168/jds.2012-5631. Epub 2012 Oct 24.
5
Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions.
BMC Proc. 2012 May 21;6 Suppl 2(Suppl 2):S10. doi: 10.1186/1753-6561-6-S2-S10.
6
Effects of marker density and population structure on the genomic prediction accuracy for growth trait in Pacific white shrimp Litopenaeus vannamei.
BMC Genet. 2017 May 17;18(1):45. doi: 10.1186/s12863-017-0507-5.
7
Accuracy of genomic selection for a sib-evaluated trait using identity-by-state and identity-by-descent relationships.
Genet Sel Evol. 2015 Feb 25;47(1):9. doi: 10.1186/s12711-014-0084-2.
8
Comparison of Genomic Selection Models to Predict Flowering Time and Spike Grain Number in Two Hexaploid Wheat Doubled Haploid Populations.
G3 (Bethesda). 2015 Jul 22;5(10):1991-8. doi: 10.1534/g3.115.019745.
9
Ridge, Lasso and Bayesian additive-dominance genomic models.
BMC Genet. 2015 Aug 25;16:105. doi: 10.1186/s12863-015-0264-2.
10
Alternative strategies for selecting subsets of predicting SNPs by LASSO-LARS procedure.
BMC Proc. 2012 May 21;6 Suppl 2(Suppl 2):S9. doi: 10.1186/1753-6561-6-S2-S9.

Cited by

1
GPS: Harnessing data fusion strategies to improve the accuracy of machine learning-based genomic and phenotypic selection.
Plant Commun. 2025 Aug 11;6(8):101416. doi: 10.1016/j.xplc.2025.101416. Epub 2025 Jun 11.
2
ShinyGS-a graphical toolkit with a serial of genetic and machine learning models for genomic selection: application, benchmarking, and recommendations.
Front Plant Sci. 2024 Dec 24;15:1480902. doi: 10.3389/fpls.2024.1480902. eCollection 2024.
3
Validation of cross-progeny variance genomic prediction using simulations and experimental data in winter elite bread wheat.
Theor Appl Genet. 2024 Sep 18;137(10):226. doi: 10.1007/s00122-024-04718-6.
4
Identification of Schizophrenia Susceptibility Loci in the Urban Taiwanese Population.
Medicina (Kaunas). 2024 Aug 6;60(8):1271. doi: 10.3390/medicina60081271.
5
Prediction of resistance, virulence, and host-by-pathogen interactions using dual-genome prediction models.
Theor Appl Genet. 2024 Aug 6;137(8):196. doi: 10.1007/s00122-024-04698-7.
6
SABO-ILSTSVR: a genomic prediction method based on improved least squares twin support vector regression.
Front Genet. 2024 Jun 14;15:1415249. doi: 10.3389/fgene.2024.1415249. eCollection 2024.
7
Investigating genomic prediction strategies for grain carotenoid traits in a tropical/subtropical maize panel.
G3 (Bethesda). 2024 May 7;14(5). doi: 10.1093/g3journal/jkae044.
8
Genomic prediction for agronomic traits in a diverse Flax (Linum usitatissimum L.) germplasm collection.
Sci Rep. 2024 Feb 8;14(1):3196. doi: 10.1038/s41598-024-53462-w.
9
Review of applications of artificial intelligence (AI) methods in crop research.
J Appl Genet. 2024 May;65(2):225-240. doi: 10.1007/s13353-023-00826-z. Epub 2024 Jan 13.
10
Evaluating the Effectiveness of 2D and 3D CT Image Features for Predicting Tumor Response to Chemotherapy.
Bioengineering (Basel). 2023 Nov 20;10(11):1334. doi: 10.3390/bioengineering10111334.