Mixture Model Tests Of Hierarchical Clustering Algorithms: The Problem Of Classifying Everybody.

Publication information

Multivariate Behav Res. 1979 Jul 1;14(3):367-84. doi: 10.1207/s15327906mbr1403_6.

Abstract

Due to the effects of outliers, mixture model tests that require all objects to be classified can severely underestimate the accuracy of hierarchical clustering algorithms. More valid and relevant comparisons between algorithms can be made by calculating accuracy at several levels in the hierarchical tree and considering accuracy as a function of the coverage of the classification. Using this procedure, several algorithms were compared on their ability to resolve ten multivariate normal mixtures. All of the algorithms were significantly more accurate than a random linkage algorithm, and accuracy was inversely related to coverage. Algorithms using correlation as the similarity measure were significantly more accurate than those using Euclidean distance (p < .001). A subset of high accuracy algorithms, including single, average, and centroid linkage using correlation, and Ward's minimum variance technique, was identified.
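The idea of scoring a hierarchical tree at several levels, rather than only at a level where every object is classified, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the paper's procedure: the function names are hypothetical, the data are a small synthetic two-component normal mixture, the algorithm is a naive single linkage on Euclidean distance, objects in clusters smaller than `min_size` are treated as unclassified, and accuracy is simple majority-label agreement among classified objects, which may differ from the agreement statistic used in the study.

```python
import math
import random
from collections import Counter


def single_linkage_levels(points):
    """Naive agglomerative single linkage (Euclidean distance).

    Returns the partition obtained after each merge, i.e. one
    partition per level of the hierarchical tree.
    """
    clusters = [[i] for i in range(len(points))]
    levels = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        levels.append([list(c) for c in clusters])
    return levels


def coverage_and_accuracy(partition, labels, min_size=3):
    """Score one level of the tree.

    Assumption: clusters with fewer than `min_size` members are left
    unclassified. Coverage is the fraction of objects classified;
    accuracy is the fraction of classified objects that carry the
    majority true label of their cluster.
    """
    classified = 0
    correct = 0
    for cluster in partition:
        if len(cluster) < min_size:
            continue  # small clusters are treated as unclassified outliers
        counts = Counter(labels[i] for i in cluster)
        classified += len(cluster)
        correct += counts.most_common(1)[0][1]
    coverage = classified / len(labels)
    accuracy = correct / classified if classified else float("nan")
    return coverage, accuracy


# Synthetic two-component bivariate normal mixture (illustrative data only).
rng = random.Random(0)
points, labels = [], []
for label, (cx, cy) in enumerate([(0.0, 0.0), (6.0, 0.0)]):
    for _ in range(15):
        points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        labels.append(label)

levels = single_linkage_levels(points)
# One (coverage, accuracy) pair per tree level: accuracy as a function of coverage.
curve = [coverage_and_accuracy(partition, labels) for partition in levels]
```

Reading `curve` from first to last level traces accuracy as coverage grows. At the final level all objects sit in one cluster, so coverage is 1.0 and majority-label accuracy falls to the share of the largest component, which is the kind of penalty that forcing every object to be classified can impose.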

