Groningen Research Institute of Pharmacy, Faculty of Science and Engineering, University of Groningen, Groningen, the Netherlands.
Division of Population Health and Genomics, Ninewells Hospital and School of Medicine, University of Dundee, Dundee, UK.
Diabetologia. 2024 Jul;67(7):1343-1355. doi: 10.1007/s00125-024-06147-y. Epub 2024 Apr 16.
AIMS/HYPOTHESIS: This study aimed to explore the added value of k-means-derived subgroups of individuals with type 2 diabetes in two primary care registries (the Netherlands and Scotland). The subgroups were inspired by Ahlqvist's novel diabetes subgroups and were previously analysed by Slieker et al.
METHODS: We used two diabetes cohorts, one Dutch and one Scottish (N=3054 and 6145; median follow-up=11.2 and 12.3 years, respectively), and defined five subgroups by k-means clustering with age at baseline, BMI, HbA1c, HDL-cholesterol and C-peptide. We investigated differences between subgroups by trajectories of risk factor values (random intercept models), time to diabetes-related complications (logrank tests and Cox models) and medication patterns (multinomial logistic models). We also directly compared the clustering indicators with the k-means discrete subgroups as predictors of progression. Cluster consistency over follow-up was assessed.
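The clustering step described in the methods can be illustrated with a minimal sketch. It assumes the five indicators are standardised before clustering and uses a plain Lloyd's-algorithm k-means written with the Python standard library; the abstract does not state the software, initialisation or preprocessing actually used, so these are assumptions. The patient values are synthetic and the `profiles` centres are hypothetical, chosen only to make the example run.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to mean 0 and SD 1 so no indicator dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(rows, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(r, centroids) for r in rows]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Synthetic data only: hypothetical centres, NOT values from the study.
# Columns: age (years), BMI (kg/m2), HbA1c (mmol/mol), HDL (mmol/l), C-peptide (nmol/l)
profiles = [
    (55, 28, 90, 1.1, 0.6),
    (60, 34, 55, 1.0, 1.8),
    (50, 38, 58, 1.1, 1.0),
    (70, 27, 50, 1.2, 0.9),
    (68, 26, 50, 1.8, 0.8),
]
noise_sd = (5, 2, 6, 0.15, 0.2)
rng = random.Random(42)
patients = [
    [rng.gauss(mu, sd) for mu, sd in zip(profile, noise_sd)]
    for profile in profiles
    for _ in range(40)
]

scaled = standardize(patients)
labels, centroids = kmeans(scaled, k=5)
```

Each entry of `labels` is the subgroup index assigned to the corresponding synthetic patient. Assigning a new measurement to its nearest centroid with `nearest` is one way a later reclassification could be carried out, though the abstract does not say how consistency over follow-up was computed.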
RESULTS: Subgroups' risk factors were significantly different, and these differences remained generally consistent over follow-up. Among all subgroups, individuals with severe insulin resistance faced a significantly higher risk of myocardial infarction both before (HR 1.65; 95% CI 1.40, 1.94) and after adjusting for age effect (HR 1.72; 95% CI 1.46, 2.02) compared with mild diabetes with high HDL-cholesterol. Individuals with severe insulin-deficient diabetes were most intensively treated, with more than 25% prescribed insulin at 10 years after diagnosis. For severe insulin-deficient diabetes relative to mild diabetes, the relative risk of using insulin vs no common treatment was higher by a factor of 3.07 (95% CI 2.73, 3.44), holding other factors constant. Clustering indicators were better predictors of progression variation than subgroups, but prediction accuracy may improve when both are combined. Clusters were consistent over 8 years with an accuracy ranging from 59% to 72%.
CONCLUSIONS/INTERPRETATION: Data-driven subgroup allocations were generally consistent over follow-up and captured significant differences in risk factor trajectories, medication patterns and complication risks. Subgroups serve better as a complement rather than as a basis for compressing clustering indicators.