Ruan Xiaoyang, Fu Sunyang, Jia Heling, Mathis Kellie L, Thiels Cornelius A, Schaeferle Gavin M, Wilson Patrick M, Storlie Curtis B, Liu Hongfang
McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
Department of Artificial Intelligence & Informatics, Mayo Clinic, Rochester, MN, USA.
Commun Med (Lond). 2025 Aug 4;5(1):331. doi: 10.1038/s43856-025-01053-9.
Ileus, a complication after colorectal surgery, increases morbidity, costs, and length of hospital stay. Assessing the risk of ileus is crucial, especially with the trend towards early discharge. Prior studies assessed the risk of ileus with regression models, but the role of deep learning remains unexplored.
We evaluated the Gated Recurrent Unit with Decay (GRU-D) for real-time ileus risk assessment in 7349 colorectal surgeries across three Mayo Clinic sites with two Electronic Health Record (EHR) systems. The results were compared with atemporal models on a panel of benchmark metrics.
Here we show that despite extreme data sparsity (e.g., 72.2% of labs and 26.9% of vitals lack measurements within 24 h post-surgery), GRU-D demonstrates improved performance by integrating new measurements and exhibits robust transferability. In brute-force transfer, the area under the receiver operating characteristic curve (AUROC) decreases by no more than 5%, while multi-source instance transfer yields up to a 2.6% improvement in AUROC and an 86% narrower confidence interval. Although atemporal models perform better at certain pre-surgical time points, their performance fluctuates considerably and generally falls short of GRU-D in post-surgical hours.
GRU-D's dynamic risk assessment capability is crucial in scenarios where clinical follow-up is essential, warranting further research on built-in explainability for clinical integration.
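The abstract does not describe how GRU-D copes with the sparsity it reports, so the following is only a minimal sketch of the input-decay idea from the original GRU-D formulation, not the authors' implementation. A missing value is replaced by a blend of the last observed value and an empirical mean, with the weight on the last observation decaying as the time since that observation grows. The function names, the fixed decay parameters `w` and `b`, and the sample data are assumptions for illustration; in a real GRU-D model the decay parameters are learned, and a similar decay is also applied to the hidden state, which is omitted here.

```python
import math


def time_since_last_observed(mask, timestamps):
    """Elapsed time since each variable was last observed (delta in GRU-D)."""
    deltas = [0.0]
    for t in range(1, len(mask)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was missing, the elapsed time keeps accumulating.
        deltas.append(gap if mask[t - 1] else gap + deltas[t - 1])
    return deltas


def decay_impute(values, mask, timestamps, empirical_mean, w=0.5, b=0.0):
    """Fill missing entries by decaying the last observation toward the mean.

    w and b are fixed here for illustration (assumption); GRU-D learns them.
    """
    deltas = time_since_last_observed(mask, timestamps)
    last_seen = empirical_mean  # fallback before any observation exists
    imputed = []
    for x, m, d in zip(values, mask, deltas):
        if m:
            last_seen = x
            imputed.append(x)
        else:
            # gamma in (0, 1]: close to 1 right after an observation,
            # approaching 0 as the gap grows.
            gamma = math.exp(-max(0.0, w * d + b))
            imputed.append(gamma * last_seen + (1.0 - gamma) * empirical_mean)
    return imputed


# Hypothetical lab series: observed at hours 0 and 3, missing at hours 1 and 2.
values = [1.0, None, None, 3.0]
mask = [1, 0, 0, 1]
hours = [0, 1, 2, 3]
imputed = decay_impute(values, mask, hours, empirical_mean=2.0)
```

In this toy series the two imputed entries drift from the last observed value (1.0) toward the assumed mean (2.0) as the gap lengthens, while observed entries pass through unchanged.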