clusterMLD: An Efficient Hierarchical Clustering Method for Multivariate Longitudinal Data.

Authors

Zhou Junyi, Zhang Ying, Tu Wanzhu

Affiliations

Department of Biostatistics and Health Data Science, Indiana University.

Department of Biostatistics, University of Nebraska Medical Center.

Publication information

J Comput Graph Stat. 2023;32(3):1131-1144. doi: 10.1080/10618600.2022.2149540. Epub 2023 Jan 12.

Abstract

Longitudinal data clustering is challenging because the grouping has to account for the similarity of individual trajectories in the presence of sparse and irregular times of observation. This paper puts forward a hierarchical agglomerative clustering method based on a dissimilarity metric that quantifies the cost of merging two distinct groups of curves, which are depicted by B-splines for the repeatedly measured data. Extensive simulations show that the proposed method has superior performance in determining the number of clusters, classifying individuals into the correct clusters, and in computational efficiency. Importantly, the method is not only suitable for clustering multivariate longitudinal data with sparse and irregular measurements but also for intensely measured functional data. Towards this end, we provide an R package for the implementation of such analyses. To illustrate the use of the proposed clustering method, two large clinical data sets from real-world clinical studies are analyzed.
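The general idea in the abstract (agglomerative merging driven by the cost of pooling two groups of spline-represented curves) can be illustrated with a minimal Python sketch. This is not the clusterMLD package or its API, and the abstract does not give the paper's exact dissimilarity metric. As a stand-in, the sketch assumes the merge cost is the increase in pooled residual sum of squares when two groups share one B-spline fit instead of two. All function names, the knot placement, and the rescaling of observation times to [0, 1] are hypothetical choices made for illustration.

```python
import numpy as np


def bspline_basis(t, interior_knots, degree=3):
    """Clamped B-spline basis on [0, 1], built with the Cox-de Boor recursion."""
    t = np.asarray(t, dtype=float)
    knots = np.concatenate([np.zeros(degree + 1),
                            np.asarray(interior_knots, dtype=float),
                            np.ones(degree + 1)])
    n_basis = len(knots) - degree - 1
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.zeros((len(t), len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Put the right endpoint t = 1 into the last non-empty interval.
    B[t == 1.0, n_basis - 1] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((len(t), len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (t - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - t) / right * B[:, j + 1]
        B = B_new
    return B


def pooled_sse(subjects, members, interior_knots, degree):
    """Residual sum of squares of one least-squares spline fit to a pooled group."""
    t = np.concatenate([subjects[i][0] for i in members])
    Y = np.vstack([subjects[i][1] for i in members])  # one column per outcome
    B = bspline_basis(t, interior_knots, degree)
    coef, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return float(np.sum((Y - B @ coef) ** 2))


def agglomerate(subjects, n_clusters, interior_knots=(0.33, 0.67), degree=3):
    """Greedy bottom-up merging; assumed cost = SSE(A and B pooled) - SSE(A) - SSE(B).

    subjects: list of (t, Y) pairs, t of shape (n_i,) with values in [0, 1],
    Y of shape (n_i, p). Observation times may differ between subjects.
    """
    clusters = [[i] for i in range(len(subjects))]
    sse = [pooled_sse(subjects, c, interior_knots, degree) for c in clusters]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                merged = pooled_sse(subjects, clusters[a] + clusters[b],
                                    interior_knots, degree)
                cost = merged - sse[a] - sse[b]
                if best is None or cost < best[0]:
                    best = (cost, a, b, merged)
        _, a, b, merged = best
        clusters[a] = clusters[a] + clusters[b]
        sse[a] = merged
        del clusters[b], sse[b]
    labels = np.empty(len(subjects), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels
```

Each subject is passed as a pair of its own observation times and a matrix with one column per outcome, so sparse and irregular schedules need no alignment. The naive pairwise search refits every candidate merge and is only meant for small examples; it makes no attempt at the computational efficiency the abstract reports, and it does not choose the number of clusters.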
