Department of Public Health Sciences, Queen's University, Kingston, Ontario, Canada.
Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.
Stat Med. 2022 Jan 15;41(1):108-127. doi: 10.1002/sim.9225. Epub 2021 Oct 20.
In clinical and epidemiological studies, there is growing interest in studying heterogeneity among patients based on longitudinal characteristics in order to identify subtypes of the study population. Compared to clustering a single longitudinal marker, simultaneously clustering multiple longitudinal markers allows additional information to be incorporated into the clustering process, which reveals co-existing longitudinal patterns and generates deeper biological insight. In the current study, we propose a Bayesian consensus clustering (BCC) model for multivariate longitudinal data. Instead of arriving at a single overall clustering, the proposed model allows each marker to follow a marker-specific local clustering, and these local clusterings are aggregated to find a global (consensus) clustering. To estimate the posterior distribution of the model parameters, a Gibbs sampling algorithm is proposed. We apply the proposed model to the primary biliary cirrhosis study to identify patient subtypes that may be associated with prognosis. We also perform simulation studies to compare the clustering performance of the proposed model with that of existing models under several scenarios. The results demonstrate that the proposed BCC model serves as a useful tool for clustering multivariate longitudinal data.
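As a rough illustration of how marker-specific local clusterings might be aggregated into a global clustering, the sketch below computes the conditional distribution of one patient's global label given that patient's local labels. It is not taken from the paper. It assumes an adherence-style link of the kind used in the general Bayesian consensus clustering framework, in which a marker's local label matches the global label with probability `adherence[r]` and otherwise falls uniformly on one of the other clusters. The function name, the parameter names, and the numeric values are hypothetical.

```python
import math


def global_label_posterior(local_labels, adherence, prior):
    """Conditional distribution of one patient's global cluster label.

    Assumed (hypothetical) model, not the paper's stated specification:
      P(local_r = k | global = c) = adherence[r]                if k == c
                                  = (1 - adherence[r]) / (K-1)  otherwise
    Requires K >= 2, 0 < adherence[r] < 1, and prior[k] > 0.
    """
    K = len(prior)
    log_post = []
    for k in range(K):
        # Start from the log prior weight of global cluster k.
        lp = math.log(prior[k])
        # Add each marker's log-probability of its observed local label.
        for label, a in zip(local_labels, adherence):
            lp += math.log(a if label == k else (1.0 - a) / (K - 1))
        log_post.append(lp)
    # Normalize on the log scale for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Three markers, two clusters: two markers place the patient in cluster 0,
# one marker places the patient in cluster 1.
posterior = global_label_posterior(
    local_labels=[0, 0, 1],
    adherence=[0.9, 0.9, 0.9],
    prior=[0.5, 0.5],
)
print(posterior)
```

In a Gibbs sampler built on this assumption, a global label would be drawn from such a distribution at each iteration, alternating with updates of the local labels and the remaining parameters. Markers with adherence near 1 pull the consensus strongly toward their own clustering, while markers with adherence near 1/K contribute almost nothing.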