Topaz Maxim, Davoudi Anahita, Evans Lauren, Sridharan Sridevi, Song Jiyoun, Chae Sena, Barrón Yolanda, Hobensack Mollie, Scharp Danielle, Cato Kenrick, Rossetti Sarah Collins, Kapela Piotr, Xu Zidu, Gupta Pallavi, Zhang Zhihong, Mcdonald Margaret V, Bowles Kathryn H
Columbia University School of Nursing, New York City, NY, USA; Data Science Institute, Columbia University, New York City, NY, USA; Center for Home Care Policy and Research, VNS Health, New York City, NY, USA.
Center for Home Care Policy and Research, VNS Health, New York City, NY, USA.
J Am Med Dir Assoc. 2025 Feb;26(2):105417. doi: 10.1016/j.jamda.2024.105417. Epub 2024 Dec 26.
Home health care (HHC) serves more than 5 million older adults annually in the United States, aiming to prevent unnecessary hospitalizations and emergency department (ED) visits. Despite efforts, up to 25% of patients in HHC experience these adverse events. The underutilization of clinical notes, aggregated data approaches, and potential demographic biases have limited previous HHC risk prediction models. This study aimed to develop a time-series risk model to predict hospitalizations and ED visits in patients in HHC, examine model performance over various prediction windows, identify top predictive variables and map them to data standards, and assess model fairness across demographic subgroups.
The study included a total of 27,222 HHC episodes that occurred between 2015 and 2017.
The study used health care process modeling of electronic health records, including clinical notes processed with natural language processing techniques and Medicare claims data. A Light Gradient Boosting Machine algorithm was used to develop the risk prediction model, with performance evaluated using 5-fold cross-validation. Model fairness was assessed across gender, race/ethnicity, and socioeconomic subgroups.
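As a rough illustration of the 5-fold cross-validation procedure mentioned above, the following is a minimal sketch in plain Python. It does not reproduce the study's pipeline: the toy one-feature data, the threshold "model" (`fit_threshold`), and all function names are hypothetical stand-ins for the Light Gradient Boosting Machine and the real features, chosen only to keep the example dependency-free.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def fit_threshold(xs, ys):
    """Hypothetical stand-in model: midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

# Toy data (assumption): one numeric feature, label 1 when the feature is >= 10.
xs = [float(i) for i in range(20)]
ys = [1 if x >= 10 else 0 for x in xs]

scores = []
for train_idx, test_idx in k_fold_indices(len(xs), k=5, seed=0):
    # Fit on the training folds only, then score on the held-out fold.
    threshold = fit_threshold([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    preds = [1 if xs[i] > threshold else 0 for i in test_idx]
    scores.append(f1_score([ys[i] for i in test_idx], preds))

mean_f1 = sum(scores) / len(scores)
print("per-fold F1:", scores, "mean:", mean_f1)
```

Each episode is held out exactly once, so the mean over the five folds estimates performance on unseen data. With episode-level data from the same patients, a grouped split would likely be preferable, although the abstract does not state how the folds were formed.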
The model achieved high predictive performance, with an F1 score of 0.84 for a 5-day prediction window. Twenty top predictive variables were identified, including novel indicators such as the length of nurse-patient visits and visit frequency. Eighty-five percent of these variables mapped completely to the US Core Data for Interoperability standard. Fairness assessment revealed performance disparities across demographic and socioeconomic groups, with lower model effectiveness for historically underserved populations.
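One simple way to surface subgroup performance disparities of this kind is to compute the F1 score separately for each group and compare the results. The sketch below assumes binary labels and uses made-up labels, predictions, and group names ("A", "B"); it is not the study's fairness procedure, which the abstract does not describe in detail.

```python
from collections import defaultdict

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_by_group(y_true, y_pred, groups):
    """Compute F1 separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: f1_score(t, p) for g, (t, p) in buckets.items()}

# Illustrative data only (assumption): two hypothetical subgroups of four episodes each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

scores = f1_by_group(y_true, y_pred, groups)
gap = max(scores.values()) - min(scores.values())
print(scores, "gap:", gap)  # {'A': 1.0, 'B': 0.5} gap: 0.5
```

A large gap between the best- and worst-served groups flags a disparity worth investigating; it does not by itself identify the cause or the appropriate adjustment.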
This study developed a robust time-series risk model for predicting adverse events in patients in HHC, incorporating diverse data types and demonstrating high predictive accuracy. The findings highlight the importance of considering established and novel risk factors in HHC. Importantly, the observed performance disparities across subgroups emphasize the need for fairness adjustments to ensure equitable risk prediction across all patient populations.