Nanini Santino, Abid Mariem, Mamouni Yassir, Wiedemann Arnaud, Jouvet Philippe, Bourassa Stephane
Clinical Decision Support System Artificial Intelligence Health Cluster in Acute Child Care, PEDIATRICS, CHU Ste-Justine Centre Hospitalier Universitaire Mère-Enfant, 3175 Boulevard de la Côte-Sainte-Catherine Drive, Montréal, QC H3T 1C5, Canada.
Solutions Applicare AI Inc., Montreal, QC H7L 4W3, Canada.
Diagnostics (Basel). 2024 Dec 8;14(23):2763. doi: 10.3390/diagnostics14232763.
BACKGROUND/OBJECTIVES: This study develops machine learning (ML) models to predict hypoxemia severity during emergency triage, particularly in Chemical, Biological, Radiological, Nuclear, and Explosive (CBRNE) scenarios, using physiological data from medical-grade sensors.
METHODS: Tree-based models (TBMs), including XGBoost, LightGBM, CatBoost, Random Forests (RFs), and Voting Classifier ensembles, as well as sequential models (LSTM, GRU), were trained on the MIMIC-III and MIMIC-IV datasets. A preprocessing pipeline handled missing data and class imbalance, with synthetic data flagged by masks. Models were evaluated using a 5-min prediction window with minute-level interpolation to support timely interventions.
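The preprocessing idea described here (minute-level interpolation, a mask marking synthetic values, and a label taken 5 minutes ahead) can be illustrated with a minimal plain-Python sketch. This is not the authors' pipeline: the function names, the use of linear interpolation, and the SpO2 severity thresholds in `severity` are all hypothetical placeholders chosen for illustration only.

```python
from bisect import bisect_left


def resample_to_minutes(samples, start, end):
    """Interpolate (minute, value) samples onto every minute in [start, end].

    Returns (values, mask). mask[i] is 0 when the value was observed and
    1 when it was filled in (synthetic). Linear interpolation is an
    assumption; the abstract does not name the interpolation method.
    """
    samples = sorted(samples)
    times = [t for t, _ in samples]
    values, mask = [], []
    for minute in range(start, end + 1):
        i = bisect_left(times, minute)
        if i < len(times) and times[i] == minute:
            values.append(float(samples[i][1]))   # real observation
            mask.append(0)
        elif i == 0:
            values.append(float(samples[0][1]))   # before first sample: hold first value
            mask.append(1)
        elif i == len(times):
            values.append(float(samples[-1][1]))  # after last sample: hold last value
            mask.append(1)
        else:
            (t0, v0), (t1, v1) = samples[i - 1], samples[i]
            values.append(v0 + (v1 - v0) * (minute - t0) / (t1 - t0))
            mask.append(1)
    return values, mask


def severity(spo2):
    """Map SpO2 (%) to a severity class. Thresholds are hypothetical."""
    if spo2 >= 94:
        return 0
    if spo2 >= 90:
        return 1
    if spo2 >= 85:
        return 2
    return 3


def make_examples(values, mask, horizon=5):
    """Pair each minute's (value, mask) with the severity `horizon` minutes later."""
    return [
        ((values[i], mask[i]), severity(values[i + horizon]))
        for i in range(len(values) - horizon)
    ]


# Hypothetical SpO2 readings at minutes 0, 4, and 6.
values, mask = resample_to_minutes([(0, 98), (4, 90), (6, 84)], start=0, end=6)
examples = make_examples(values, mask, horizon=5)
```

Keeping the mask as a feature lets a downstream model, and a clinician reading feature importances, distinguish measured values from filled-in ones.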
RESULTS: TBMs outperformed sequential models in speed, interpretability, and reliability, making them better suited for real-time decision-making. Feature importance analysis identified six key physiological variables from the enhanced NEWS2+ score and emphasized the value of mask and score features for transparency. Voting Classifier ensembles showed slight metric gains but did not outperform individually optimized models; they faced a precision-sensitivity tradeoff and had slightly lower F1-scores for key severity levels.
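To make the ensemble and the per-severity metrics concrete, the following is a small plain-Python sketch of soft voting (averaging class probabilities across models) and of precision, sensitivity, and F1 for a single class. It is a stand-in for illustration, not the authors' implementation: the abstract does not say whether hard or soft voting was used, and the probabilities and labels below are invented.

```python
def soft_vote(model_probs):
    """Average per-class probabilities across models; return the winning class.

    model_probs: one list of class probabilities per model (equal lengths).
    Ties resolve to the lowest class index.
    """
    n_classes = len(model_probs[0])
    avg = [sum(p[c] for p in model_probs) / len(model_probs) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


def per_class_metrics(y_true, y_pred, cls):
    """Return (precision, sensitivity, F1) for one class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Three hypothetical models scoring one patient over three severity classes.
winner = soft_vote([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.6, 0.3]])

# Invented labels and predictions for six patients.
y_true = [0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 2, 2, 2, 1]
precision, sensitivity, f1 = per_class_metrics(y_true, y_pred, cls=2)
```

Reporting the three metrics per severity class, rather than a single averaged score, is what exposes the kind of precision-sensitivity tradeoff noted above.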
CONCLUSIONS: TBMs were effective for real-time hypoxemia prediction, while sequential models, though better at handling temporal dependencies, were computationally costly. This study highlights ML's potential to improve triage systems and reduce alarm fatigue; future work will incorporate multi-hospital datasets for broader applicability.