Jung Jinsun, Kang Sunghoon, Choi Jeeyae, El-Kareh Robert, Lee Hyungbok, Kim Hyeoneui
College of Nursing, Seoul National University, Seoul, Republic of Korea; Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, Republic of Korea.
The Department of Science Studies, Seoul National University, Seoul, Republic of Korea.
Int J Med Inform. 2025 Sep;201:105943. doi: 10.1016/j.ijmedinf.2025.105943. Epub 2025 Apr 21.
Explainable Artificial Intelligence (XAI) is increasingly vital in healthcare, where clinicians need to understand and trust AI-generated recommendations. However, the impact of AI model explanations on clinical decision-making remains insufficiently explored.
To evaluate how AI model explanations influence clinicians' mental models, trust, and satisfaction regarding machine learning-based predictions of Intensive Care Unit (ICU) Length of Stay (LOS).
This retrospective mixed-methods study analyzed electronic health record data from 8,579 patients admitted to a surgical ICU in South Korea between 2019 and 2022. Seven machine learning models were developed and evaluated to predict ICU LOS at 2-hour intervals during the initial 12 hours post-admission. The Random Forest (RF) model in the 10- to 12-hour window, with an AUROC of 0.903, was selected for explanation using SHapley Additive exPlanations (SHAP). Fifteen ICU clinicians assessed four distinct types of explanations ('Why', 'Why not', 'How to', and 'What if') via web-based experiments, surveys, and interviews.
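For orientation, a minimal sketch of the kind of modeling-and-explanation pipeline described above is shown below, using scikit-learn and the shap package in Python. The synthetic data, feature names, and binary "prolonged LOS" target are illustrative assumptions only, not the study's actual variables or code.

```python
# Illustrative sketch: Random Forest LOS classifier + SHAP attributions.
# All data and feature names are synthetic stand-ins, not the study's variables.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
feature_names = ["heart_rate", "lactate", "gcs", "creatinine", "fio2"]  # hypothetical EHR features
X = rng.normal(size=(n, len(feature_names)))
# Hypothetical binary target: prolonged ICU stay (yes/no)
y = (X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# SHAP attributes each prediction to the input features,
# which underpins "Why"-style explanations for individual patients.
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_te)  # per-class feature attributions for the tree ensemble
```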
Clinicians' feature selections aligned more closely with the RF model after explanations, as demonstrated by an increase in Spearman correlation from -0.147 (p = 0.275) to 0.868 (p < 0.001). The average trust score improved from 2.8 to 3.9. The average satisfaction scores for the 'Why', 'Why not', 'How to', and 'What if' explanations were 3.3, 3.8, 3.6, and 4.1, respectively.
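The rank-agreement statistic reported above can be illustrated with SciPy's Spearman correlation. The rankings below are hypothetical and show only how such a correlation is computed between a model-derived feature ordering and a clinician-derived one, not the study's data.

```python
# Illustrative rank-agreement check (hypothetical rankings).
from scipy.stats import spearmanr

model_rank = [1, 2, 3, 4, 5]      # features ordered by model/SHAP importance
clinician_rank = [2, 1, 3, 5, 4]  # same features ordered by clinician ratings

rho, p_value = spearmanr(model_rank, clinician_rank)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.3f}")
```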
AI model explanations notably enhanced clinicians' understanding of and trust in AI-generated ICU LOS predictions, although complete alignment with their mental models was not achieved. Further refinement of AI model explanations is needed to support clinician-AI collaboration and the integration of AI into clinical practice.