Pairwise variable selection for high-dimensional model-based clustering.

Authors

Guo Jian, Levina Elizaveta, Michailidis George, Zhu Ji

Affiliation

Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA.

Publication

Biometrics. 2010 Sep;66(3):793-804. doi: 10.1111/j.1541-0420.2009.01341.x.

Abstract

Variable selection for clustering is an important and challenging problem in high-dimensional data analysis. Existing variable selection methods for model-based clustering select informative variables in a "one-in-all-out" manner; that is, a variable is selected if at least one pair of clusters is separable by this variable and removed if it cannot separate any of the clusters. In many applications, however, it is of interest to further establish exactly which clusters are separable by each informative variable. To address this question, we propose a pairwise variable selection method for high-dimensional model-based clustering. The method is based on a new pairwise penalty. Results on simulated and real data show that the new method performs better than alternative approaches that use $\ell_1$ and $\ell_\infty$ penalties and offers better interpretation.
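The distinction between "one-in-all-out" selection and pairwise selection can be illustrated with a small sketch. This is not the authors' estimation procedure. It assumes a matrix of already-estimated cluster means and only shows how the two kinds of summary differ. The helper names (`separable_pairs`, `pairwise_penalty`), the tolerance `tol`, and the example means are hypothetical. The penalty is written as a sum of absolute differences between cluster means for each variable, which is an assumed form since the abstract does not give the formula.

```python
from itertools import combinations


def separable_pairs(means, tol=1e-8):
    """For each variable j, list the cluster pairs whose means differ by more than tol.

    `means` is a list of clusters, each a list of per-variable means.
    """
    n_clusters = len(means)
    n_vars = len(means[0])
    return {
        j: [
            (k, k2)
            for k, k2 in combinations(range(n_clusters), 2)
            if abs(means[k][j] - means[k2][j]) > tol
        ]
        for j in range(n_vars)
    }


def pairwise_penalty(means, lam):
    """Assumed penalty form: lam * sum over variables and cluster pairs of |mean difference|."""
    n_clusters = len(means)
    n_vars = len(means[0])
    return lam * sum(
        abs(means[k][j] - means[k2][j])
        for j in range(n_vars)
        for k, k2 in combinations(range(n_clusters), 2)
    )


# Hypothetical estimated means: 3 clusters (rows) by 3 variables (columns).
means = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 2.0],
    [0.0, -1.0, 2.0],
]

pairs = separable_pairs(means)

# "One-in-all-out" view: a variable is kept if it separates at least one pair.
informative = [j for j, p in pairs.items() if p]

# Pairwise view: which clusters each informative variable separates.
for j in informative:
    print(f"variable {j} separates cluster pairs {pairs[j]}")

print("penalty:", pairwise_penalty(means, lam=0.5))
```

In this toy example, variable 0 is uninformative, while variables 1 and 2 are both informative but separate different cluster pairs. The one-in-all-out summary cannot express that difference.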
