
A New Algorithm and Theory for Penalized Regression-based Clustering.

Authors

Wu Chong, Kwon Sunghoon, Shen Xiaotong, Pan Wei

Affiliations

Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA.

Department of Applied Statistics, Konkuk University, Seoul, South Korea.

School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA.

Publication information

J Mach Learn Res. 2016;17.

PMID: 31662706
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6818515/

Abstract

Clustering is unsupervised and exploratory in nature. Yet, it can be performed through penalized regression with grouping pursuit, as demonstrated in Pan et al. (2013). In this paper, we develop a more efficient algorithm for scalable computation and a new theory of clustering consistency for the method. This algorithm, called DC-ADMM, combines difference of convex (DC) programming with the alternating direction method of multipliers (ADMM). It is shown to be more computationally efficient than the quadratic penalty based algorithm of Pan et al. (2013) because its updating formulas are available in closed form. Numerically, we compare the DC-ADMM algorithm with the quadratic penalty algorithm to demonstrate its utility and scalability. Theoretically, we establish a finite-sample mis-clustering error bound for penalized regression based clustering with constrained regularization in a general setting. On this ground, we provide conditions for clustering consistency of the penalized clustering method. As an end product, we make an R package implementing PRclust, with various loss and grouping penalty functions, available on GitHub and CRAN.
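
To make the DC-plus-ADMM idea concrete, the following is a minimal Python sketch, not the authors' implementation and not the API of the PRclust R package. It assumes a squared-error loss and a truncated L2 grouping penalty, min(||mu_i - mu_j||, tau), on all pairs of per-observation centroids; the abstract itself does not fix either choice. The outer loop is the DC step (pairs whose current difference norm is at least tau are left unpenalized), and the inner loop is a scaled-form ADMM on the resulting convex problem. The centroid update here is a plain linear solve rather than the closed-form formulas the abstract refers to, and the names `dc_admm_prclust`, `labels_from_centroids`, `lam`, `tau` and `rho` are all hypothetical.

```python
import itertools

import numpy as np


def group_soft_threshold(v, t):
    """Shrink each row of v toward zero by t in Euclidean norm."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * v


def dc_admm_prclust(x, lam, tau, rho=1.0, dc_iters=5, admm_iters=300):
    """Illustrative DC-ADMM for penalized-regression clustering.

    Assumed objective (not taken from the abstract):
        0.5 * sum_i ||x_i - mu_i||^2
        + lam * sum_{i<j} min(||mu_i - mu_j||, tau)
    """
    n = x.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    # Pairwise difference operator: row k of d @ mu equals mu_i - mu_j.
    d = np.zeros((len(pairs), n))
    for row, (i, j) in enumerate(pairs):
        d[row, i], d[row, j] = 1.0, -1.0

    mu = x.copy()                  # start from one centroid per observation
    theta = d @ mu                 # split variable: theta_ij = mu_i - mu_j
    u = np.zeros_like(theta)       # scaled dual variable
    system = np.eye(n) + rho * d.T @ d

    for _ in range(dc_iters):
        # DC step: only pairs currently closer than tau keep an L2 penalty.
        active = np.linalg.norm(theta, axis=1) < tau
        for _ in range(admm_iters):
            # Centroid update: a ridge-type linear system.
            mu = np.linalg.solve(system, x + rho * d.T @ (theta + u))
            delta = d @ mu
            # Difference update: group soft-thresholding on active pairs.
            target = delta - u
            theta = target.copy()
            theta[active] = group_soft_threshold(target[active], lam / rho)
            # Dual update.
            u = u + theta - delta
    return mu, theta, pairs


def labels_from_centroids(mu, tol=1e-2):
    """Give the same label to observations whose centroids coincide."""
    labels = -np.ones(len(mu), dtype=int)
    next_label = 0
    for i in range(len(mu)):
        if labels[i] >= 0:
            continue
        close = np.linalg.norm(mu - mu[i], axis=1) < tol
        labels[close & (labels < 0)] = next_label
        next_label += 1
    return labels


# Toy data: two tight, well-separated groups of four points each.
x_demo = np.array([
    [0.0, 0.0], [0.2, 0.1], [-0.1, 0.3], [0.1, -0.2],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1], [5.1, 5.3],
])
mu_demo, _, _ = dc_admm_prclust(x_demo, lam=0.5, tau=1.0)
labels_demo = labels_from_centroids(mu_demo)
```

On this toy data the truncation removes the penalty on every between-group pair after the first DC step, so each group's centroids are fused toward that group's mean without being pulled toward the other group. The number of clusters is read off from the fused centroids rather than specified in advance.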

Similar articles

1
A New Algorithm and Theory for Penalized Regression-based Clustering.
J Mach Learn Res. 2016;17.
2
Algorithms for Fitting the Constrained Lasso.
J Comput Graph Stat. 2018;27(4):861-871. doi: 10.1080/10618600.2018.1473777. Epub 2018 Aug 7.
3
Efficient ℓ0-norm feature selection based on augmented and penalized minimization.
Stat Med. 2018 Feb 10;37(3):473-486. doi: 10.1002/sim.7526. Epub 2017 Oct 30.
4
Splitting Methods for Convex Clustering.
J Comput Graph Stat. 2015;24(4):994-1013. doi: 10.1080/10618600.2014.948181. Epub 2015 Dec 10.
5
Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty.
J Mach Learn Res. 2013 Jul 1;14(7):1865.
6
A Path Algorithm for Constrained Estimation.
J Comput Graph Stat. 2013;22(2):261-283. doi: 10.1080/10618600.2012.681248.
7
Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix-variate fMRI data.
Biometrics. 2023 Sep;79(3):2246-2259. doi: 10.1111/biom.13753. Epub 2022 Sep 13.
8
The convergence rate of the proximal alternating direction method of multipliers with indefinite proximal regularization.
J Inequal Appl. 2017;2017(1):19. doi: 10.1186/s13660-017-1295-1. Epub 2017 Jan 14.
9
Extensions to the Proximal Distance Method of Constrained Optimization.
J Mach Learn Res. 2022;23.
10
Identification of gene pairs through penalized regression subject to constraints.
BMC Bioinformatics. 2017 Nov 3;18(1):466. doi: 10.1186/s12859-017-1872-9.

Cited by

1
Biclustering analysis of functionals via penalized fusion.
J Multivar Anal. 2022 May;189. doi: 10.1016/j.jmva.2021.104874. Epub 2021 Oct 29.
2
Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix-variate fMRI data.
Biometrics. 2023 Sep;79(3):2246-2259. doi: 10.1111/biom.13753. Epub 2022 Sep 13.
3
A New Semiparametric Approach to Finite Mixture of Regressions using Penalized Regression via Fusion.
Stat Sin. 2020 Apr;30(2):783-807. doi: 10.5705/ss.202016.0531.
4
Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data.
J Mach Learn Res. 2021 Jan;22.
5
Provable Convex Co-clustering of Tensors.
J Mach Learn Res. 2020;21.

References

1
Splitting Methods for Convex Clustering.
J Comput Graph Stat. 2015;24(4):994-1013. doi: 10.1080/10618600.2014.948181. Epub 2015 Dec 10.
2
Integrative and regularized principal component analysis of multiple sources of data.
Stat Med. 2016 Jun 15;35(13):2235-50. doi: 10.1002/sim.6866. Epub 2016 Jan 12.
3
Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty.
J Mach Learn Res. 2013 Jul 1;14(7):1865.
4
Likelihood-based selection and sharp parameter estimation.
J Am Stat Assoc. 2012 Jan 1;107(497):223-232. doi: 10.1080/01621459.2011.645783. Epub 2012 Jun 11.
5
K-means clustering: a half-century synthesis.
Br J Math Stat Psychol. 2006 May;59(Pt 1):1-34. doi: 10.1348/000711005X48266.
6
Survey of clustering algorithms.
IEEE Trans Neural Netw. 2005 May;16(3):645-78. doi: 10.1109/TNN.2005.845141.