On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling.

Affiliations

Unidad Monterey, University of Granada, Granada, Spain.

Centro De Investigación En Matemáticas, Unidad Monterrey, Mexico.

Publication details

Psychometrika. 2021 Jun;86(2):489-513. doi: 10.1007/s11336-021-09757-2. Epub 2021 May 19.

DOI:10.1007/s11336-021-09757-2

PMID:34008128

Abstract

In this article, we analyse the usefulness of multidimensional scaling in relation to performing K-means clustering on a dissimilarity matrix, when the dimensionality of the objects is unknown. In this situation, traditional algorithms cannot be used, and so K-means clustering procedures are being performed directly on the basis of the observed dissimilarity matrix. Furthermore, the application of criteria originally formulated for two-mode data sets to determine the number of clusters depends on their possible reformulation in a one-mode situation. The linear invariance property in K-means clustering for squared dissimilarities, together with the use of multidimensional scaling, is investigated to determine the cluster membership of the observations and to address the problem of selecting the number of clusters in K-means for a dissimilarity matrix. In particular, we analyse the performance of K-means clustering on the full dimensional scaling configuration and on the equivalently partitioned configuration related to a suitable translation of the squared dissimilarities. A Monte Carlo experiment is conducted in which the methodology examined is compared with the results obtained by procedures directly applicable to a dissimilarity matrix.
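The invariance the abstract relies on can be illustrated with a short numerical sketch. This is not the authors' procedure: it assumes classical (Torgerson) scaling as a stand-in for the full-dimensional configuration, uses synthetic Euclidean data, and all function names (`classical_mds`, `wcss_from_d2`, `wcss_from_coords`) are hypothetical. It checks two things: the K-means criterion computed from squared dissimilarities alone matches the one computed on the scaling configuration, and adding a constant to every off-diagonal squared dissimilarity shifts the criterion by an amount that depends only on the number of objects and clusters, not on the partition.

```python
import numpy as np

def classical_mds(d2):
    """Classical scaling of a squared-dissimilarity matrix, keeping every
    dimension with a positive eigenvalue (an assumed reading of 'full' MDS)."""
    n = d2.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ d2 @ centring          # double-centred inner products
    evals, evecs = np.linalg.eigh(b)
    keep = evals > 1e-9 * max(evals.max(), 1.0)  # drop numerically zero dimensions
    return evecs[:, keep] * np.sqrt(evals[keep])

def wcss_from_d2(d2, labels):
    """Within-cluster sum of squares using only squared dissimilarities:
    sum over clusters of (1 / (2 n_k)) * sum_{i,j in cluster k} d2[i, j]."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        total += d2[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return total

def wcss_from_coords(x, labels):
    """Usual K-means criterion: squared distances to cluster centroids."""
    total = 0.0
    for k in np.unique(labels):
        pts = x[labels == k]
        total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

# Synthetic example: two groups of 10 points in 3 dimensions (illustrative only).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(5, 1, (10, 3))])
labels = np.repeat([0, 1], 10)
diff = points[:, None, :] - points[None, :, :]
d2 = (diff ** 2).sum(axis=2)                     # squared Euclidean dissimilarities

config = classical_mds(d2)
w_d2 = wcss_from_d2(d2, labels)
w_mds = wcss_from_coords(config, labels)

# Translate every off-diagonal squared dissimilarity by a constant c.
c = 2.5
d2_shift = d2 + c * (1 - np.eye(len(d2)))
w_shift = wcss_from_d2(d2_shift, labels)
n_objects, n_clusters = len(labels), 2
expected_offset = c * (n_objects - n_clusters) / 2  # same for every partition with 2 clusters

print(np.isclose(w_d2, w_mds), np.isclose(w_shift - w_d2, expected_offset))
```

Because the offset is identical for all partitions with the same number of clusters, the partition minimising the criterion is unchanged by the translation. Any standard K-means routine could then be run on `config` in place of the fixed `labels` used here.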

Similar articles

Variance-Based Cluster Selection Criteria in a K-Means Framework for One-Mode Dissimilarity Data.

Psychometrika. 2017 Jun;82(2):275-294. doi: 10.1007/s11336-017-9561-1. Epub 2017 Feb 13.

A robust alternating least squares K-means clustering approach for times series using dynamic time warping dissimilarities.

Math Biosci Eng. 2024 Feb 6;21(3):3631-3651. doi: 10.3934/mbe.2024160.

Identifying cell types from single-cell data based on similarities and dissimilarities between cells.

BMC Bioinformatics. 2021 May 18;22(Suppl 3):255. doi: 10.1186/s12859-020-03873-z.

Front Genet. 2022 Jul 1;13:912711. doi: 10.3389/fgene.2022.912711. eCollection 2022.

Profiling local optima in K-means clustering: developing a diagnostic technique.

Psychol Methods. 2006 Jun;11(2):178-92. doi: 10.1037/1082-989X.11.2.178.

A unified approach based on multidimensional scaling for calibration estimation in survey sampling with qualitative auxiliary information.

Stat Methods Med Res. 2023 Apr;32(4):760-772. doi: 10.1177/09622802231151211. Epub 2023 Feb 15.

Modified multidimensional scaling approach to analyze financial markets.

Chaos. 2014 Jun;24(2):022102. doi: 10.1063/1.4873523.

The comparison of automated clustering algorithms for resampling representative conformer ensembles with RMSD matrix.

J Cheminform. 2017 Mar 23;9(1):21. doi: 10.1186/s13321-017-0208-0.

Inverse MDS: Inferring Dissimilarity Structure from Multiple Item Arrangements.

Front Psychol. 2012 Jul 25;3:245. doi: 10.3389/fpsyg.2012.00245. eCollection 2012.

Cited by

The impact of neglecting feature scaling in k-means clustering.

PLoS One. 2024 Dec 6;19(12):e0310839. doi: 10.1371/journal.pone.0310839. eCollection 2024.
