Holler Emma, Ludema Christina, Ben Miled Zina, Rosenberg Molly, Kalbaugh Corey, Boustani Malaz, Mohanty Sanjay
Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, United States.
Department of Epidemiology & Biostatistics, Indiana University Bloomington, Bloomington, IN, United States.
JMIR Perioper Med. 2025 Jan 9;8:e59422. doi: 10.2196/59422.
Postoperative delirium (POD) is a common complication after major surgery and is associated with poor outcomes in older adults. Early identification of patients at high risk of POD can enable targeted prevention efforts. However, existing POD prediction models rely on data collected during the hospital stay, which delays predictions and limits scalability.
This study aimed to develop and externally validate a machine learning-based prediction model for POD using routine electronic health record (EHR) data.
We identified all surgical encounters from 2014 to 2021 for patients aged 50 years and older who underwent an operation requiring general anesthesia, with a length of stay of at least 1 day at 3 Indiana hospitals. Patients with preexisting dementia or mild cognitive impairment were excluded. POD was identified using Confusion Assessment Method records and delirium International Classification of Diseases (ICD) codes. Controls without delirium or nurse-documented confusion were matched to cases by age, sex, race, and year of admission. We trained logistic regression, random forest, extreme gradient boosting (XGB), and neural network models to predict POD using 143 features derived from routine EHR data available at the time of hospital admission. Separate models were developed for each hospital using surveillance periods of 3 months, 6 months, and 1 year before admission. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Each model was internally validated using holdout data and externally validated using data from the other 2 hospitals. Calibration was assessed using calibration curves.
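The AUROC used for evaluation can be read as the probability that a randomly chosen delirium case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that calculation, not the study's code: the function name `auroc` and the toy labels and scores are hypothetical, chosen only to show how a holdout estimate and an external-site estimate might be compared.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC (Mann-Whitney U); tied scores count as half a win."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one case and one control")

    # Sort indices by score and give tied scores their average 1-based rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the cases, normalized by the number of case-control pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical toy data: 1 = delirium case, 0 = matched control.
labels = [1, 1, 1, 0, 0, 0]
holdout_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]   # same-hospital holdout set
external_scores = [0.9, 0.5, 0.4, 0.7, 0.6, 0.2]  # another hospital's data

print(f"holdout AUROC:  {auroc(labels, holdout_scores):.2f}")
print(f"external AUROC: {auroc(labels, external_scores):.2f}")
```

In this toy example the external scores rank fewer cases above controls, so the external AUROC is lower than the holdout AUROC, which is the pattern external validation is designed to expose.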
The study cohort included 7167 delirium cases and 7167 matched controls. XGB outperformed all other classifiers. AUROCs were highest for XGB models trained on 12 months of preadmission data. The best-performing XGB model achieved a mean AUROC of 0.79 (SD 0.01) on the holdout set, which decreased to 0.69-0.74 (SD 0.02) when externally validated on data from other hospitals.
Our routine EHR-based POD prediction models demonstrated good predictive ability using a limited set of preadmission and surgical variables, though their generalizability was limited. The proposed models could be used as a scalable, automated screening tool to identify patients at high risk of POD at the time of hospital admission.