Unidad Monterey, University of Granada, Granada, Spain.
Centro De Investigación En Matemáticas, Unidad Monterrey, Mexico.
Psychometrika. 2021 Jun;86(2):489-513. doi: 10.1007/s11336-021-09757-2. Epub 2021 May 19.
In this article, we analyse the usefulness of multidimensional scaling for performing K-means clustering on a dissimilarity matrix when the dimensionality of the objects is unknown. In this situation, traditional algorithms cannot be used, so K-means clustering is performed directly on the observed dissimilarity matrix. Furthermore, criteria for determining the number of clusters that were originally formulated for two-mode data sets can be applied only if they can be reformulated for the one-mode situation. The linear invariance property of K-means clustering for squared dissimilarities, together with the use of multidimensional scaling, is investigated both to determine the cluster membership of the observations and to address the problem of selecting the number of clusters in K-means for a dissimilarity matrix. In particular, we analyse the performance of K-means clustering on the full-dimensional scaling configuration and on the equivalently partitioned configuration obtained from a suitable translation of the squared dissimilarities. A Monte Carlo experiment is conducted in which the methodology examined is compared with procedures that are directly applicable to a dissimilarity matrix.
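The two ingredients named in the abstract can be illustrated with a small numerical sketch. It is not the authors' procedure. It only checks, on synthetic data, two standard facts that the approach relies on. First, the K-means criterion for a fixed partition can be computed from squared dissimilarities alone, and it coincides with the criterion on a full-dimensional classical scaling configuration. Second, adding a constant c to every off-diagonal squared dissimilarity shifts the criterion by c(n - K)/2, which is the same for every partition into K clusters, so the ranking of partitions is unchanged. The function names, the data and the constant c are assumptions made for the illustration.

```python
import numpy as np

def wcss_from_coords(X, labels):
    """Usual K-means criterion: squared distances to cluster centroids."""
    total = 0.0
    for k in np.unique(labels):
        pts = X[labels == k]
        total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

def wcss_from_sq_dissim(D2, labels):
    """Same criterion computed from squared dissimilarities only:
    sum over clusters of (1 / (2 n_k)) * sum of within-cluster d_ij^2."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        block = D2[np.ix_(idx, idx)]
        total += block.sum() / (2.0 * len(idx))
    return total

def classical_mds(D2, tol=1e-10):
    """Full-dimensional classical (Torgerson) scaling of squared dissimilarities.
    Keeps every dimension with a positive eigenvalue."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered matrix
    vals, vecs = np.linalg.eigh(B)
    keep = vals > tol
    return vecs[:, keep] * np.sqrt(vals[keep])

# Synthetic example (assumption): 12 points in 3 dimensions, 3 clusters of 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
labels = np.repeat([0, 1, 2], 4)
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)

w_coords = wcss_from_coords(X, labels)
w_dissim = wcss_from_sq_dissim(D2, labels)
w_mds = wcss_from_coords(classical_mds(D2), labels)

# Translate the off-diagonal squared dissimilarities by a constant c.
c = 5.0
n, K = len(labels), len(np.unique(labels))
D2_shifted = D2 + c * (1.0 - np.eye(n))
w_shifted = wcss_from_sq_dissim(D2_shifted, labels)

print(np.isclose(w_coords, w_dissim), np.isclose(w_coords, w_mds))
print(np.isclose(w_shifted, w_dissim + c * (n - K) / 2.0))
```

Because the shift depends only on n and K, any partition that minimises the criterion for the original squared dissimilarities also minimises it after the translation. The shift does change with K, so comparisons across different numbers of clusters are affected.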