clustvarsel: A Package Implementing Variable Selection for Gaussian Model-Based Clustering in R.

Authors

Scrucca Luca, Raftery Adrian E

Affiliations

Department of Economics, Università degli Studi di Perugia, Via A. Pascoli, 20, 06123 Perugia, Italy, URL: http://www.stat.unipg.it/luca.

Department of Statistics, University of Washington, Box 354320, Seattle, WA 98195-4320, United States of America, URL: http://www.stat.washington.edu/raftery/.

Publication details

J Stat Softw. 2018 Apr;84. doi: 10.18637/jss.v084.i01. Epub 2018 Apr 17.

Abstract

Finite mixture modeling provides a framework for cluster analysis based on parsimonious Gaussian mixture models. Variable or feature selection is of particular importance in situations where only a subset of the available variables provides clustering information. This enables the selection of a more parsimonious model, yielding more efficient estimates, a clearer interpretation and, often, improved clustering partitions. This paper describes the R package clustvarsel, which performs subset selection for model-based clustering. An improved version of the Raftery and Dean (2006) methodology is implemented in the new release of the package to find the (locally) optimal subset of variables with group/cluster information in a dataset. Search over the solution space is performed using either a stepwise greedy search or a headlong algorithm. Adjustments for speeding up these algorithms are discussed, as well as a parallel implementation of the stepwise search. Usage of the package is presented through the discussion of several data examples.
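The search strategy mentioned in the abstract can be illustrated with a rough sketch. This is written in Python for illustration and is not the package's R code: `stepwise_select`, `toy_gain`, and the variable names `x1` to `x4` are hypothetical. The `gain` callback is an assumed stand-in for the BIC-based evidence that a variable carries clustering information given the variables already selected, and the headlong variant is simplified here to accept the first candidate that clears a single threshold.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence

# Hypothetical scoring signature: evidence that `candidate` is useful for
# clustering, given the variables already selected (stand-in for a BIC difference).
Gain = Callable[[str, FrozenSet[str]], float]


def stepwise_select(variables: Sequence[str], gain: Gain, threshold: float = 0.0,
                    headlong: bool = False, max_iter: int = 100) -> List[str]:
    """Alternate inclusion and removal steps until neither changes the subset."""
    selected: List[str] = []
    for _ in range(max_iter):  # guard against cycling
        changed = False

        # Inclusion step: propose the best (greedy) or the first acceptable
        # (headlong) variable that is not yet selected.
        best: Optional[str] = None
        best_gain = threshold
        for v in variables:
            if v in selected:
                continue
            g = gain(v, frozenset(selected))
            if g > best_gain:
                best, best_gain = v, g
                if headlong:
                    break
        if best is not None:
            selected.append(best)
            changed = True

        # Removal step: drop the weakest selected variable if its evidence,
        # given the other selected variables, does not exceed the threshold.
        worst: Optional[str] = None
        worst_gain = threshold
        for v in selected:
            g = gain(v, frozenset(selected) - {v})
            if g <= worst_gain:
                worst, worst_gain = v, g
        if worst is not None:
            selected.remove(worst)
            changed = True

        if not changed:
            break
    return selected


def toy_gain(candidate: str, selected: FrozenSet[str]) -> float:
    """Made-up scores: x4 is informative alone but redundant once x1 is selected."""
    if candidate == "x4" and "x1" in selected:
        return -2.0
    return {"x1": 5.0, "x2": 3.0, "x3": -1.0, "x4": 2.5}[candidate]


greedy_result = stepwise_select(["x1", "x2", "x3", "x4"], toy_gain)
headlong_result = stepwise_select(["x4", "x2", "x1", "x3"], toy_gain, headlong=True)
print(greedy_result, headlong_result)
```

In this toy run the headlong search first accepts `x4` because it is scanned first, then drops it in a later removal step once `x1` has entered, so both strategies end with the same subset. With real data the two searches may reach different local optima.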


Similar articles

Variable selection for clustering with Gaussian mixture models.
Biometrics. 2009 Sep;65(3):701-9. doi: 10.1111/j.1541-0420.2008.01160.x. Epub 2009 Feb 4.

Cited by

Population density and spreading of COVID-19 in England and Wales.
PLoS One. 2022 Mar 31;17(3):e0261725. doi: 10.1371/journal.pone.0261725. eCollection 2022.
