

Evaluating the impact of explainable AI on clinicians' decision-making: A study on ICU length of stay prediction.

Author Information

Jung Jinsun, Kang Sunghoon, Choi Jeeyae, El-Kareh Robert, Lee Hyungbok, Kim Hyeoneui

Affiliations

College of Nursing, Seoul National University, Seoul, Republic of Korea; Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, Republic of Korea.

The Department of Science Studies, Seoul National University, Seoul, Republic of Korea.

Publication Information

Int J Med Inform. 2025 Sep;201:105943. doi: 10.1016/j.ijmedinf.2025.105943. Epub 2025 Apr 21.

Abstract

BACKGROUND

Explainable Artificial Intelligence (XAI) is increasingly vital in healthcare, where clinicians need to understand and trust AI-generated recommendations. However, the impact of AI model explanations on clinical decision-making remains insufficiently explored.

OBJECTIVES

To evaluate how AI model explanations influence clinicians' mental models, trust, and satisfaction regarding machine learning-based predictions of Intensive Care Unit (ICU) Length of Stay (LOS).

METHODS

This retrospective mixed-methods study analyzed electronic health record data from 8,579 patients admitted to a surgical ICU in South Korea between 2019 and 2022. Seven machine learning models were developed and evaluated to predict ICU LOS at 2-hour intervals during the initial 12 hours post-admission. The Random Forest (RF) model in the 10- to 12-hour window, with an AUROC of 0.903, was selected for explanation using SHapley Additive exPlanations (SHAP). Fifteen ICU clinicians assessed four distinct types of explanations ('Why', 'Why not', 'How to', and 'What if') via web-based experiments, surveys, and interviews.
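
The abstract gives no implementation details; the following is a minimal sketch of the pipeline it describes (train a Random Forest, check AUROC, explain predictions with SHAP), assuming scikit-learn and the shap package. The synthetic features, the binary prolonged-LOS label, and the hyperparameters are illustrative assumptions, not the study's data or configuration.

```python
# Minimal sketch, not the authors' code: Random Forest + SHAP explanation.
# Assumes scikit-learn and the shap package; all data below is synthetic.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))               # placeholder for EHR-derived features
y = (X[:, 0] + rng.normal(size=1000)) > 0     # placeholder for a prolonged-LOS label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("AUROC:", roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))

# SHAP attributes each prediction to individual features, which is the basis
# of 'Why'-style explanations such as those shown to the clinicians.
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test[:5])
```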

RESULTS

Clinicians' feature selections aligned more closely with the RF model after explanations, as demonstrated by an increase in Spearman correlation from -0.147 (p = 0.275) to 0.868 (p < 0.001). The average trust score improved from 2.8 to 3.9. The average satisfaction scores for the 'Why', 'Why not', 'How to', and 'What if' explanations were 3.3, 3.8, 3.6, and 4.1, respectively.
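
As an illustration, the alignment reported above can be quantified as Spearman's rank correlation between the model's feature-importance ranking and the clinicians' feature ranking. The snippet below uses scipy; the rankings are invented placeholders, not the study's data.

```python
# Illustrative only: Spearman's rho between model and clinician feature rankings.
from scipy.stats import spearmanr

model_rank     = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # model feature-importance ranks
clinician_rank = [2, 1, 4, 3, 5, 7, 6, 8, 10, 9]   # clinician ranks (hypothetical)

rho, p_value = spearmanr(model_rank, clinician_rank)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.3f}")
```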

CONCLUSION

AI model explanations notably enhanced clinicians' understanding and trust in AI-generated ICU LOS predictions, although complete alignment with their mental models was not achieved. Further refinement of AI model explanations is needed to support better clinician-AI collaboration and the integration of AI into clinical practice.

