Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent.

Affiliations

Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.

Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA.

Publication information

BMC Bioinformatics. 2020 Sep 15;21(1):407. doi: 10.1186/s12859-020-03725-w.

DOI:10.1186/s12859-020-03725-w
PMID:32933477
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7493359/
Abstract

BACKGROUND

Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
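The three ingredients named above (proximal gradient descent, backtracking line search, warm starts along a penalty grid) can be illustrated for the plain lasso. The following is a minimal sketch under stated assumptions, not the seagull implementation: seagull is an R package, whereas this example uses Python with NumPy, and all function names, default values and the 1/(2n) loss scaling are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def smooth_loss(X, y, b):
    """Least-squares goodness of fit, assumed scaling: 1/(2n) * ||y - X b||^2."""
    r = y - X @ b
    return 0.5 * (r @ r) / len(y)

def lasso_prox_grad(X, y, lam, b0, step=1.0, shrink=0.5,
                    max_iter=1000, max_backtrack=60, tol=1e-8):
    """Proximal gradient descent for the lasso, started from b0."""
    n = len(y)
    b = b0.copy()
    for _ in range(max_iter):
        grad = -X.T @ (y - X @ b) / n
        f_old = smooth_loss(X, y, b)
        t = step
        # Backtracking line search: shrink the step size t until the
        # quadratic upper bound on the smooth loss holds.
        for _ in range(max_backtrack):
            b_new = soft_threshold(b - t * grad, t * lam)
            d = b_new - b
            if smooth_loss(X, y, b_new) <= f_old + grad @ d + (d @ d) / (2 * t):
                break
            t *= shrink
        b = b_new
        if np.max(np.abs(d)) < tol:
            break
    return b

def lasso_path(X, y, n_lambda=20, ratio=1e-2):
    """Regularization path over a decreasing, log-spaced penalty grid."""
    n = len(y)
    # Smallest penalty for which the all-zero vector is a solution.
    lam_max = np.max(np.abs(X.T @ y)) / n
    lambdas = lam_max * np.logspace(0, np.log10(ratio), n_lambda)
    b = np.zeros(X.shape[1])
    path = []
    for lam in lambdas:
        # Warm start: reuse the previous solution as the starting point.
        b = lasso_prox_grad(X, y, lam, b)
        path.append(b.copy())
    return lambdas, np.array(path)
```

Calling `lasso_path(X, y)` returns the penalty grid and one coefficient vector per grid value. Starting each fit from the previous solution usually means only a few iterations are needed per penalty value, which is the motivation for warm starts.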

RESULTS

Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.

CONCLUSIONS

The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
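For orientation, the first three operators can be written as one objective in a common textbook form. This parameterization is an assumption made here for illustration and may differ from the exact scaling and feature weights used in seagull. The symbols are: response y with n observations, design matrix X, coefficient vector β split into L groups β^(l) of size p_l, penalty parameter λ ≥ 0 and mixing parameter α in [0, 1].

```latex
\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  \;+\; \alpha\,\lambda\,\lVert \beta \rVert_1
  \;+\; (1-\alpha)\,\lambda \sum_{l=1}^{L} \sqrt{p_l}\,\lVert \beta^{(l)} \rVert_2
```

With α = 1 the group term vanishes and the objective reduces to the lasso; with α = 0 it reduces to the group lasso; values strictly between 0 and 1 give the sparse-group lasso.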

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afe8/7493359/f442a72c6968/12859_2020_3725_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afe8/7493359/96f5dc72a3f3/12859_2020_3725_Fig1_HTML.jpg

Similar articles

1
Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent.
BMC Bioinformatics. 2020 Sep 15;21(1):407. doi: 10.1186/s12859-020-03725-w.
2
A Fitted Sparse-Group Lasso for Genome-Based Evaluations.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):30-38. doi: 10.1109/TCBB.2022.3156805. Epub 2023 Feb 3.
3
IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data.
Comput Math Methods Med. 2017;2017:7691937. doi: 10.1155/2017/7691937. Epub 2017 May 4.
4
Accounting for grouped predictor variables or pathways in high-dimensional penalized Cox regression models.
BMC Bioinformatics. 2020 Jul 2;21(1):277. doi: 10.1186/s12859-020-03618-y.
5
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.
BMC Bioinformatics. 2011 May 9;12:138. doi: 10.1186/1471-2105-12-138.
6
The graphical lasso: New insights and alternatives.
Electron J Stat. 2012 Nov 9;6:2125-2149. doi: 10.1214/12-EJS740.
7
Standardization and the Group Lasso Penalty.
Stat Sin. 2012 Jul;22(3):983-1001. doi: 10.5705/ss.2011.075.
8
Combining Sparse Group Lasso and Linear Mixed Model Improves Power to Detect Genetic Variants Underlying Quantitative Traits.
Front Genet. 2019 Apr 10;10:271. doi: 10.3389/fgene.2019.00271. eCollection 2019.
9
Clustering in linear-mixed models with a group fused lasso penalty.
Biom J. 2014 Jan;56(1):44-68. doi: 10.1002/bimj.201200111. Epub 2013 Nov 18.
10
Efficient methods for overlapping group lasso.
IEEE Trans Pattern Anal Mach Intell. 2013 Sep;35(9):2104-16. doi: 10.1109/TPAMI.2013.17.

Cited by

1
Multi-omics exploration of chaperone-mediated immune-proteostasis crosstalk in vascular dementia and identification of diagnostic biomarkers.
Front Immunol. 2025 Jul 30;16:1615540. doi: 10.3389/fimmu.2025.1615540. eCollection 2025.
2
Transfer Learning for Error-Contaminated Poisson Regression Models.
Stat Med. 2025 Jul;44(15-17):e70163. doi: 10.1002/sim.70163.
3
MyESL: Sparse learning in molecular evolution and phylogenetic analysis.
ArXiv. 2025 Jan 9:arXiv:2501.04941v1.
4
Causality-driven candidate identification for reliable DNA methylation biomarker discovery.
Nat Commun. 2025 Jan 15;16(1):680. doi: 10.1038/s41467-025-56054-y.
5
HighDimMixedModels.jl: Robust high-dimensional mixed-effects models across omics data.
PLoS Comput Biol. 2025 Jan 13;21(1):e1012143. doi: 10.1371/journal.pcbi.1012143. eCollection 2025 Jan.
6
Deconstructing Intratumoral Heterogeneity through Multiomic and Multiscale Analysis of Serial Sections.
Cancers (Basel). 2024 Jul 1;16(13):2429. doi: 10.3390/cancers16132429.
7
Deconstructing intratumoral heterogeneity through multiomic and multiscale analysis of serial sections.
bioRxiv. 2024 Mar 18:2023.06.21.545365. doi: 10.1101/2023.06.21.545365.
8
Factors related to early and rapid assessment of in-hospital mortality among older adult trauma patients in an earthquake.
World J Emerg Med. 2022;13(6):425-432. doi: 10.5847/wjem.j.1920-8642.2022.099.
9
A Novel Algorithm for Feature Selection Using Penalized Regression with Applications to Single-Cell RNA Sequencing Data.
Biology (Basel). 2022 Oct 12;11(10):1495. doi: 10.3390/biology11101495.
10
LTBP2 inhibits prostate cancer progression and metastasis via the PI3K/AKT signaling pathway.
Exp Ther Med. 2022 Jul 8;24(3):563. doi: 10.3892/etm.2022.11500. eCollection 2022 Sep.

References

1
Genetic Variants Detection Based on Weighted Sparse Group Lasso.
Front Genet. 2020 Mar 3;11:155. doi: 10.3389/fgene.2020.00155. eCollection 2020.
2
DNA methylation aging clocks: challenges and recommendations.
Genome Biol. 2019 Nov 25;20(1):249. doi: 10.1186/s13059-019-1824-y.
3
IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data.
Comput Math Methods Med. 2017;2017:7691937. doi: 10.1155/2017/7691937. Epub 2017 May 4.
4
Using DNA Methylation Profiling to Evaluate Biological Age and Longevity Interventions.
Cell Metab. 2017 Apr 4;25(4):954-960.e6. doi: 10.1016/j.cmet.2017.03.016.