Srivilaithon Winchana, Thanasarnpaiboon Pichamon
Department of Emergency Medicine, Faculty of Medicine, Thammasat University, Pathum Thani, Thailand.
BMC Emerg Med. 2025 Feb 21;25(1):28. doi: 10.1186/s12873-025-01185-0.
Emergency endotracheal intubation is a critical skill for managing airway emergencies in the emergency department (ED). Accurate prediction of difficult laryngoscopy is essential for improving first-attempt success, minimizing complications, optimizing resource utilization, and enhancing patient outcomes. Traditional methods, such as the LEMON criteria, have limited predictive accuracy. Machine learning (ML) offers advanced predictive capabilities by analyzing large datasets and identifying complex variable interactions. This study aimed to develop ML models for predicting difficult laryngoscopy in the ED, validate their performance, and compare them with a conventional regression model.
A retrospective cohort study was conducted on 4,370 adult patients who underwent intubation in the ED at Thammasat University Hospital. Difficult laryngoscopy was defined as a Cormack-Lehane grade III or IV. Patients were divided into development (training, 70%) and validation (testing, 30%) cohorts. Predictors of difficult laryngoscopy were identified using multivariable stepwise backward elimination logistic regression and were used to develop ML models, including Logistic Regression, Decision Tree, Random Forest, and XGBoost. Model performance was evaluated using the area under the receiver operating characteristic curve (AuROC), accuracy, precision, recall, and F1-score. Validation was performed on the validation cohort to confirm model accuracy.
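The evaluation steps described above (a 70/30 split followed by AuROC, accuracy, precision, recall, and F1-score) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the fixed random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AuROC is computed here through its rank (Mann-Whitney) interpretation rather than with any particular library.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into development and validation sets.

    The seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = difficult)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auroc(y_true, scores):
    """AuROC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AuROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort size taken from the abstract; the split itself is only a sketch.
dev_idx, val_idx = split_indices(4370)

# Toy labels and predicted probabilities (hypothetical, not study data).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
```

Accuracy, precision, recall, and F1-score depend on the chosen threshold, whereas AuROC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are reported side by side.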
Nine significant predictors were identified: male sex, trauma, absence of neuromuscular blocking agents, large incisors, large tongue, limited mouth opening, short thyrohyoid distance, obstructed airway, and poor neck mobility. The Random Forest model demonstrated the highest predictive performance, with an AuROC of 0.82 (95% CI: 0.78-0.85), accuracy of 0.89, recall of 0.89, and F1-score of 0.87, outperforming conventional regression (AuROC 0.76, 95% CI: 0.73-0.78) and the other ML models. DeLong's test confirmed that the difference in AuROC between the Random Forest model and conventional regression was statistically significant (p = 0.002). The Decision Tree showed limited performance due to overfitting, while XGBoost demonstrated strong precision. Neither the Decision Tree nor XGBoost differed significantly from conventional regression (p = 0.498 and 0.496, respectively).
The Random Forest model provides the most robust prediction of difficult laryngoscopy, outperforming both conventional and other ML methods. While ML models improve predictive accuracy, logistic regression remains a practical option in resource-limited settings. Integrating ML into clinical workflows could enhance decision-making, resource allocation, and patient safety in emergency airway management. Future research should prioritize external validation and real-world implementation.