Song Xing, Waitman Lemuel R, Yu Alan SL, Robbins David C, Hu Yong, Liu Mei
University of Kansas Medical Center, Department of Internal Medicine, Division of Medical Informatics, Kansas City, KS, United States.
University of Kansas Medical Center, Division of Nephrology and Hypertension and the Kidney Institute, Kansas City, KS, United States.
JMIR Med Inform. 2020 Jan 31;8(1):e15510. doi: 10.2196/15510.
Artificial intelligence-enabled electronic health record (EHR) analysis can revolutionize medical practice, from diagnosing and predicting complex diseases to making recommendations in patient care. This is especially true for chronic conditions such as chronic kidney disease (CKD), which is one of the most frequent complications in patients with diabetes and is associated with substantial morbidity and mortality.
The longitudinal prediction of health outcomes requires effective representation of temporal data in the EHR. In this study, we proposed a novel temporal-enhanced gradient boosting machine (GBM) model that dynamically updates and ensembles learners based on new events in patient timelines to improve the prediction accuracy of CKD among patients with diabetes.
Using a broad spectrum of deidentified EHR data on a retrospective cohort of 14,039 adult patients with type 2 diabetes and GBM as the base learner, we validated our proposed Landmark-Boosting model against three state-of-the-art temporal models for rolling predictions of 1-year CKD risk.
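The rolling 1-year prediction setup described above can be illustrated with a generic landmarking step. The sketch below is not the authors' Landmark-Boosting algorithm; it only shows, under simplifying assumptions, how a patient timeline might be turned into one training sample per landmark year. The function name `build_landmark_samples`, the event tuple layout, the feature names, and the toy values are all hypothetical, and censoring is ignored for brevity.

```python
from collections import defaultdict


def build_landmark_samples(events, ckd_onset, landmarks=(1, 2, 3), horizon=1.0):
    """Build one sample per (patient, landmark year).

    events:    list of (patient_id, time_in_years, feature_name, value),
               with time measured from diabetes onset (assumed layout).
    ckd_onset: dict patient_id -> CKD onset time in years, or None.
    Returns a list of (patient_id, landmark, features, label) tuples.
    """
    by_patient = defaultdict(list)
    for pid, t, name, value in events:
        by_patient[pid].append((t, name, value))

    samples = []
    for pid, rows in by_patient.items():
        rows.sort(key=lambda r: r[0])  # chronological order
        onset = ckd_onset.get(pid)
        for lm in landmarks:
            # Patients who already have CKD at the landmark are no longer at risk.
            if onset is not None and onset <= lm:
                continue
            # Use only data observed up to the landmark; the latest value wins.
            features = {}
            for t, name, value in rows:
                if t <= lm:
                    features[name] = value
            # Label: CKD onset within the prediction horizon after the landmark.
            label = int(onset is not None and lm < onset <= lm + horizon)
            samples.append((pid, lm, features, label))
    return samples


# Toy, made-up data purely for illustration.
events = [
    ("p1", 0.2, "egfr", 90.0),
    ("p1", 1.5, "egfr", 70.0),
    ("p1", 0.5, "hba1c", 8.1),
    ("p2", 0.8, "egfr", 85.0),
]
ckd_onset = {"p1": 2.4, "p2": None}
samples = build_landmark_samples(events, ckd_onset)
```

Each landmark's samples could then be passed to a base learner such as a GBM; how the learners are updated and ensembled across landmarks is specific to the paper and is not reproduced here.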
The proposed model uniformly outperformed other models, achieving an area under the receiver operating characteristic curve of 0.83 (95% CI 0.76-0.85), 0.78 (95% CI 0.75-0.82), and 0.82 (95% CI 0.78-0.86) in predicting CKD risk with automatic accumulation of new data in later years (years 2, 3, and 4 since diabetes mellitus onset, respectively). The Landmark-Boosting model also maintained the best calibration across moderate- and high-risk groups and over time. The experimental results demonstrated that the proposed temporal model can not only accurately predict 1-year CKD risk but also improve performance over time with additionally accumulated data, which is essential for clinical use to improve renal management of patients with diabetes.
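For readers unfamiliar with the metric reported above, the following minimal sketch computes the area under the receiver operating characteristic curve as the probability that a randomly chosen case is scored higher than a randomly chosen noncase, with a percentile bootstrap for a 95% CI. The abstract does not state how the intervals were obtained, so the bootstrap is an assumption, and the labels and scores are made-up values.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (case, noncase) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one noncase")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (an assumed, generic choice of method)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to resample")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up labels (1 = CKD within 1 year) and predicted risks.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
point = auroc(labels, scores)  # 14 of 16 pairs ranked correctly -> 0.875
lower, upper = bootstrap_ci(labels, scores)
```

The pairwise loop is quadratic in cohort size; it is written for clarity, and a rank-based formula would be preferable for cohorts of the size studied here.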
Incorporating temporal information from EHR data can significantly improve predictive model performance and will particularly benefit patients who follow up with their physicians as recommended.