Chair of Health Economics, Technical University of Munich, Georg-Brauchle-Ring, Munich, Bavaria, 80992, Germany.
BMC Med Res Methodol. 2023 Sep 27;23(1):212. doi: 10.1186/s12874-023-02019-y.
Healthcare, like other sectors, has undergone progressive digitalization, generating an ever-increasing wealth of data that enables research on and analysis of patient movement. Such analysis can help to evaluate treatment processes and outcomes and, in turn, improve the quality of care. This scoping review provides an overview of the algorithms and methods that have been used to identify care pathways from healthcare utilization data.
This review was conducted according to the methodology of the Joanna Briggs Institute and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist. The PubMed, Web of Science, Scopus, and EconLit databases were searched, and studies published in English between 2000 and 2021 were considered. The search strategy used keywords divided into three categories: the method of data analysis, the requirement profile for the data, and the intended presentation of results. Studies were included if they analyzed health data, described the methodology used, and considered the chronology of care events. In a two-stage review process, two researchers independently reviewed records for inclusion. Results were synthesized narratively.
The literature search yielded 2,865 entries; 51 studies met the inclusion criteria. Health data from different countries ([Formula: see text]) and of different types of disease ([Formula: see text]) were analyzed with respect to different care events. Applied methods can be divided into those identifying subsequences of care and those describing full care trajectories. Variants of pattern mining or Markov models were mostly used to extract subsequences, with clustering often applied to find care trajectories. Statistical algorithms such as rule mining, probability-based machine learning algorithms or a combination of methods were also applied. Clustering methods were sometimes used for data preparation or result compression. Further characteristics of the included studies are presented.
Various data mining methods are already being applied to gain insight from health data. The great heterogeneity of the methods used shows the need for a scoping review. Synthesizing the results narratively, we found that clustering methods currently dominate the literature for identifying complete care trajectories, while variants of pattern mining dominate for identifying subsequences of limited length.
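As a minimal illustration of the two method families named above for extracting subsequences (pattern mining and Markov models), the following Python sketch counts frequent contiguous subsequences of care events and estimates first-order transition probabilities. The toy data, event labels, and function names are hypothetical and are not taken from any included study; the algorithms used in the reviewed literature are generally more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one chronologically ordered list of care events per patient.
pathways = {
    "p1": ["GP", "specialist", "hospital", "rehab"],
    "p2": ["GP", "specialist", "hospital"],
    "p3": ["GP", "hospital", "rehab"],
}


def frequent_subsequences(sequences, length, min_support):
    """Return contiguous subsequences of the given length found in at least
    `min_support` patients (support = number of patients, not occurrences)."""
    support = Counter()
    for events in sequences:
        # A set ensures each patient contributes at most once per pattern.
        seen = {tuple(events[i:i + length])
                for i in range(len(events) - length + 1)}
        support.update(seen)
    return {pattern: n for pattern, n in support.items() if n >= min_support}


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities between care events
    from observed consecutive pairs."""
    counts = defaultdict(Counter)
    for events in sequences:
        for current, following in zip(events, events[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


patterns = frequent_subsequences(pathways.values(), length=2, min_support=2)
probabilities = transition_probabilities(pathways.values())
```

In this toy example, `patterns` holds the two-event subsequences shared by at least two patients, and `probabilities["GP"]` gives the estimated share of patients moving from a GP visit to each following event. Identifying full care trajectories by clustering would additionally require a distance measure between whole sequences, which is not sketched here.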