Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore.
Program in Health Services and Systems Research, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore.
BMC Med Res Methodol. 2018 Nov 3;18(1):121. doi: 10.1186/s12874-018-0584-9.
Data-driven population segmentation analysis uses data analytics to divide a heterogeneous population into a parsimonious set of relatively homogeneous groups with similar healthcare characteristics. It is a promising patient-centric approach that enables effective, integrated healthcare interventions tailored to each segment. Although the approach is widely applied, no systematic review has examined its clinical application.
We carried out a systematic literature search of PubMed, Embase and Web of Science following PRISMA criteria. We included English-language, peer-reviewed articles that applied data-driven population segmentation analysis to empirical health data. We summarized the clinical settings in which segmentation analysis was applied; compared the strengths, limitations, and practical considerations of different segmentation methods; and assessed the segmentation outcomes of all included studies. Two independent reviewers assessed the studies.
We retrieved 14,514 articles and included 216. Data-driven population segmentation analysis was widely used in different clinical contexts. Of the included studies, 163 examined the general population, while 53 focused on specific populations with certain diseases or conditions, including psychological, oncological, respiratory, cardiovascular, and gastrointestinal conditions. The variables used for segmentation were heterogeneous across studies. Most studies used secondary data (n = 170) and were conducted in community settings (n = 185). The most common segmentation method was latent class/profile/transition/growth analysis (n = 96), followed by K-means cluster analysis (n = 60) and hierarchical analysis (n = 50), each with its own advantages, disadvantages, and practical considerations. We also identified key criteria for evaluating a segmentation framework: internal validity, external validity, identifiability/interpretability, substantiality, stability, actionability/accessibility, and parsimony.
Data-driven population segmentation has been widely applied and holds great potential for managing population health. Evaluating segmentation outcomes requires the interplay of data analytics and subject matter expertise. The optimal framework for segmentation requires further research.
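As an illustration of K-means cluster analysis, one of the segmentation methods the review found most common, the following is a minimal sketch and not the procedure of any included study. It is written in plain Python, and everything in it is an assumption made for illustration: the toy data, the two variables (number of chronic conditions and clinic visits per year), the function names `kmeans` and `wcss`, and the simplified deterministic initialization. The within-cluster sum of squares is shown as one simple internal measure of how compact the segments are; real analyses would typically standardize variables and use a library implementation.

```python
# Hypothetical toy data: (number of chronic conditions, clinic visits per year).
# These values are invented for illustration and are not taken from the review.
points = [(1, 2), (0, 3), (1, 1), (2, 2), (1, 3), (0, 1),
          (5, 12), (6, 10), (4, 11), (5, 13), (6, 12), (4, 14)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(data, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels).

    Assumption: centroids start at evenly spaced data points so the
    example is deterministic. Practical implementations usually use
    random or k-means++ initialization with several restarts.
    """
    n = len(data)
    centroids = [data[i * n // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each person joins the nearest centroid's segment.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in data]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels


def wcss(data, centroids, labels):
    """Within-cluster sum of squares: lower means more homogeneous segments."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(data, labels))


centroids, labels = kmeans(points, 2)
print("segment labels:", labels)
print("segment centroids:", centroids)

# Compare compactness across candidate numbers of segments.
for k in (1, 2, 3):
    c, lab = kmeans(points, k)
    print(f"k={k}  WCSS={wcss(points, c, lab):.2f}")
```

On this toy data the two-segment solution separates the low-utilization records from the high-utilization ones. Compactness alone does not settle the number of segments, since the within-cluster sum of squares tends to fall as segments are added; the other criteria listed above, such as interpretability, actionability, and parsimony, still call for subject matter judgment.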