Cwm Taf Morgannwg University Health Board, Ynysmeurig House, Navigation Park, Abercynon, CF45 4SN, UK.
Division of Population Medicine, School of Medicine, Cardiff University, Cardiff, UK.
BMC Public Health. 2024 Jun 18;24(1):1621. doi: 10.1186/s12889-024-19065-w.
In recent years, data-driven population segmentation using cluster analysis, mainly of health care utilisation data, has been used as a proxy for future health care need. Chronic condition patterns have tended to be examined after segmentation, but may be useful as a segmentation variable which, in combination with utilisation, could indicate severity. Such segments could also be of practical use for targeting specific clinical groups, including for prevention. This study aimed to assess the ability of data-driven segmentation based on health care utilisation and comorbidities to predict future outcomes: emergency admission, A&E attendance, GP practice contacts, and mortality.
We analysed record-linked data for 412,997 patients registered with GP practices in 2018-19 in the Cwm Taf Morgannwg University Health Board (CTM UHB) area, held within the Secure Anonymised Information Linkage (SAIL) Databank. We created 10 segments using k-means clustering of 2018 utilisation (GP practice contacts, prescriptions, emergency and elective admissions, A&E and outpatient attendances) and chronic condition counts, using different variable compositions to denote need. We assessed the characteristics of the segments. We employed a train/test scheme (80% training set) to compare logistic regression model predictions with observed outcomes on follow-up in 2019. We assessed the area under the ROC curve (AUC) for models with demographic variables, with and without the segments, and compared segmentation implementations (with/without comorbidity and primary care data).
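The segmentation step can be illustrated with a minimal k-means sketch. Everything in it is an assumption made for illustration: the eight toy patient rows are invented (not SAIL data), the four columns are a reduced stand-in for the study's variables, z-score standardisation and farthest-first initialisation are choices of this sketch rather than details reported in the abstract, and k is set to 2 instead of the 10 segments used in the study.

```python
from statistics import mean, pstdev

# Invented toy rows: (GP contacts, prescriptions, emergency admissions,
# chronic condition count) for one year. Not study data.
patients = [
    (2, 1, 0, 0), (3, 2, 0, 0), (1, 0, 0, 0), (4, 2, 0, 1),          # low use
    (20, 15, 2, 4), (25, 18, 3, 5), (18, 12, 1, 3), (22, 16, 2, 4),  # high use
]

def standardise(rows):
    """Z-score each column so no single count dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Deterministic farthest-first initialisation (an illustrative choice).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each patient joins the nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids

segments, _ = kmeans(standardise(patients), k=2)
print(segments)
```

In a setting like the one described, the resulting segment label would then enter the outcome model as a categorical covariate alongside the demographic variables.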
Adding the segments to the model with demographic covariates improved prediction for all outcomes. For emergency admissions, discrimination increased from AUC 0.65 (CI 0.64-0.65) to 0.73 (CI 0.73-0.74). Models with the segments only performed nearly as well as the full models. Excluding comorbidity reduced predictive ability for mortality (performance was otherwise similar), but the most pronounced reduction occurred when all primary care variables were excluded.
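The AUC values above can be read as the probability that a randomly chosen patient who had the outcome was assigned a higher predicted risk than a randomly chosen patient who did not, with ties counting as one half. The sketch below computes that quantity directly. The predicted risks are invented toy numbers chosen only to show the calculation; they are not outputs of the study's models, and the variable names are hypothetical.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    # A case scoring above a non-case counts 1, a tie counts 0.5.
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))

# Invented follow-up outcomes (1 = event occurred) for six toy patients.
observed = [0, 0, 0, 0, 1, 1]

# Invented predicted risks from two hypothetical models.
risk_demographics_only = [0.1, 0.3, 0.2, 0.4, 0.3, 0.5]
risk_with_segments = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5]

print(auc(risk_demographics_only, observed))  # 0.8125
print(auc(risk_with_segments, observed))      # 1.0
```

The pairwise form is quadratic in the number of patients, so for a cohort of the size analysed here a rank-based implementation would be the practical choice; the pairwise version is shown because it mirrors the definition.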
These results show that the segments have satisfactory predictive ability, even though the outcomes were varied and a broad range of events and conditions was used in the segmentation. They suggest that the segments can be a useful tool for identifying specific groups of need to target with anticipatory care. Identification may be refined with selected diagnoses or with more specialised tools such as risk stratification.