Division of Research, Kaiser Permanente, Oakland, California, USA.
Department of Epidemiology, University of Washington, Seattle, Washington, USA.
J Am Med Inform Assoc. 2019 Dec 1;26(12):1466-1477. doi: 10.1093/jamia/ocz106.
To use unsupervised topic modeling to evaluate heterogeneity in sepsis treatment patterns contained within granular data of electronic health records.
We conducted a multicenter, retrospective cohort study of 29 253 adult patients hospitalized with sepsis between 2010 and 2013 in Northern California. We applied an unsupervised machine learning method, Latent Dirichlet Allocation, to the orders, medications, and procedures recorded in the electronic health record within the first 24 hours of each patient's hospitalization. The goals were to uncover empiric treatment topics across the cohort and to develop a computable clinical signature for each patient based on the proportions of these topics. We evaluated how these topics correlated with common sepsis treatment and outcome metrics, including inpatient mortality, time to first antibiotic, and fluids given within 24 hours.
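To make the modeling step concrete, the following is a minimal sketch of Latent Dirichlet Allocation applied to per-patient "documents" of order tokens, returning one topic-proportion vector (a clinical signature) per patient. It is an illustration under stated assumptions, not the study's pipeline: the inference method shown (collapsed Gibbs sampling), the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy order tokens are all hypothetical choices that the abstract does not specify.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed inference method).

    docs: list of token lists, one list per hospitalization.
    Returns one topic-proportion vector per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions; each vector sums to 1.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical first-24-hour order tokens for four hospitalizations.
docs = [
    ["ceftriaxone", "azithromycin", "chest_xray", "oxygen"],
    ["vancomycin", "wound_culture", "dressing_change"],
    ["norepinephrine", "central_line", "lactate", "fluid_bolus"],
    ["ceftriaxone", "chest_xray", "oxygen", "fluid_bolus"],
]
signatures = fit_lda(docs, n_topics=3)
```

Each entry of `signatures` is a vector of topic proportions for one hospitalization. On a toy corpus this small the topics themselves are not meaningful; the sketch only shows the shape of the input and output.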
Mean age was 70 ± 17 years with hospital mortality of 9.6%. We empirically identified 42 clinically recognizable treatment topics (eg, pneumonia, cellulitis, wound care, shock). Only 43.1% of hospitalizations had a single dominant topic, and a small minority (7.3%) had a single topic comprising at least 80% of their overall clinical signature. Across the entire sepsis cohort, clinical signatures were highly variable.
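The two heterogeneity summaries reported above can be computed directly from per-patient topic proportions, as in the sketch below. The abstract does not define "dominant", so the 0.5 cutoff used here is an assumption; the 0.8 cutoff follows the "at least 80%" figure in the text. The function name `summarize_signatures` and the toy vectors are hypothetical.

```python
def summarize_signatures(signatures, dominance_cutoff=0.5, concentration_cutoff=0.8):
    """Summarize how concentrated each clinical signature is.

    signatures: list of topic-proportion vectors, each summing to 1.
    dominance_cutoff is an assumed definition of a "dominant" topic.
    """
    n = len(signatures)
    # Hospitalizations whose largest topic exceeds the assumed dominance cutoff.
    dominant = sum(1 for s in signatures if max(s) > dominance_cutoff)
    # Hospitalizations whose largest topic makes up at least 80% of the signature.
    concentrated = sum(1 for s in signatures if max(s) >= concentration_cutoff)
    return {
        "dominant_share": dominant / n,
        "concentrated_share": concentrated / n,
    }


# Hypothetical signatures over three topics for four hospitalizations.
toy_signatures = [
    [0.85, 0.10, 0.05],
    [0.55, 0.30, 0.15],
    [0.40, 0.35, 0.25],
    [0.34, 0.33, 0.33],
]
summary = summarize_signatures(toy_signatures)
```

With these toy vectors, two of four signatures pass the assumed dominance cutoff and one of four passes the 80% cutoff. The reported 43.1% and 7.3% would be the corresponding shares in the study cohort under the authors' own definitions.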
Heterogeneity in sepsis is a major barrier to improving targeted treatments, yet existing approaches to characterizing clinical heterogeneity are narrowly defined. A machine learning approach captured substantial patient- and population-level heterogeneity in treatment during early sepsis hospitalization.
Topic modeling based on treatment patterns may enable more precise clinical characterization in sepsis and a better understanding of variability in sepsis presentation and outcomes.