Gupta Preeti, Pearce Alex K, Pham Thaidan, Miller Michael, Brunetti Korey, Heskett Karen, Malhotra Atul, Mayampurath Anoop, Afshar Majid
Scripps Research, La Jolla, CA, USA.
University of California San Diego, San Diego, CA, USA.
Intensive Care Med Exp. 2025 Aug 8;13(1):83. doi: 10.1186/s40635-025-00791-3.
Artificial intelligence (AI) has emerged as a promising tool for decision support in managing acute respiratory failure, yet its real-world clinical impact remains unclear. This scoping review identifies clinically validated AI-driven tools in this domain, focusing on the reporting of key evaluation quality measures that are a prerequisite for broader deployment.
Studies were included if they compared a clinical, human factors, or health systems-related outcome of an AI-driven intervention to a control group in adult patients with acute respiratory failure. Studies were excluded if they lacked a machine learning model, compared models trained on the same dataset, assessed only model performance, or evaluated models in simulated settings. A systematic literature search was conducted in PubMed, CINAHL, and Embase from inception until January 2025. Each abstract was independently screened by two reviewers. One reviewer extracted data and performed quality assessment, following the DECIDE-AI framework for early-stage clinical evaluation of AI-based decision support systems.
Of 5,987 citations, six studies met the eligibility criteria. The studies, conducted between 2012 and 2024 in Taiwan, Italy, and the U.S., included 40 to 2,536 patients. Four studies (67%) focused on predicting weaning from mechanical ventilation. Three studies (50%) demonstrated a statistically significant and clinically meaningful outcome. Studies met a median of 3.5 (IQR: 2.25-6.25) of the 17 DECIDE-AI criteria. None reported AI-related errors, malfunctions, or algorithmic fairness considerations. Only one study (17%) described user characteristics and adherence, while two (33%) assessed human-computer agreement and usability.
Our review identified six studies evaluating AI-driven decision support tools for acute respiratory failure, with most focusing on predicting weaning from mechanical ventilation. However, methodological rigor for early clinical evaluation was inconsistent, with studies meeting few of the DECIDE-AI criteria. Notably, critical aspects such as error reporting, algorithmic fairness, and user adherence were largely unaddressed. Further high-quality assessments of reliability, usability, and real-world implementation are essential to realize the potential of these tools to transform patient care.