A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.

Author information

Department of Analytics, Information Systems, & Supply Chain, Florida State University.

Department of Psychological Sciences, University of Missouri.

Publication information

Psychol Methods. 2017 Sep;22(3):563-580. doi: 10.1037/met0000095. Epub 2016 Sep 8.

DOI:10.1037/met0000095

PMID:27607543

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC5982597/

Abstract

The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data.
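
To make the contrast between K-means and K-median on dichotomous data concrete, the following is a minimal sketch and not the procedure used in the article, whose algorithms are not described in this abstract. It assumes a simple Lloyd-style alternating heuristic: K-means assigns objects by squared Euclidean distance and uses cluster means as centers, while K-median assigns by Manhattan distance (the Hamming distance for 0/1 data) and uses coordinate-wise medians, which for binary variables reduce to a majority vote. The function name `cluster_binary`, the random initialization, and the simulated two-class data are all illustrative assumptions.

```python
import numpy as np


def cluster_binary(X, k, method="median", n_iter=100, seed=0):
    """Lloyd-style alternating heuristic for K-means or K-median on 0/1 data.

    Hypothetical helper for illustration; not the article's algorithm.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Initialize centers from k randomly chosen rows (an illustrative choice).
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        diff = X[:, None, :] - centers[None, :, :]
        if method == "median":
            dist = np.abs(diff).sum(axis=2)   # Manhattan distance
        else:
            dist = (diff ** 2).sum(axis=2)    # squared Euclidean distance
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                             # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue                      # keep the old center if a cluster empties
            if method == "median":
                # Coordinate-wise median of 0/1 values: majority vote, ties go to 1.
                centers[j] = (members.mean(axis=0) >= 0.5).astype(float)
            else:
                centers[j] = members.mean(axis=0)
    return labels, centers


# Simulated data (assumed for illustration): two classes, ten dichotomous variables.
rng = np.random.default_rng(1)
probs = np.array([[0.9] * 5 + [0.1] * 5,
                  [0.1] * 5 + [0.9] * 5])
truth = np.repeat([0, 1], 100)
X = (rng.random((200, 10)) < probs[truth]).astype(int)

for method in ("mean", "median"):
    labels, centers = cluster_binary(X, k=2, method=method)
    print(method, np.round(centers, 2))
```

In this sketch the K-median centers are themselves 0/1 response patterns, whereas the K-means centers are proportions between 0 and 1. That difference in how a cluster is summarized is one reason the two methods can partition the same dichotomous data differently.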
