Benchmarking the most popular XAI used for explaining clinical predictive models: Untrustworthy but could be useful.

Author information

Brankovic Aida, Cook David, Rahman Jessica, Khanna Sankalp, Huang Wenjie

Affiliations

CSIRO Australian e-Health Research Centre, Brisbane, QLD, Australia.

Intensive Care Unit, Princess Alexandra Hospital, Brisbane, QLD, Australia.

Publication information

Health Informatics J. 2024 Oct-Dec;30(4):14604582241304730. doi: 10.1177/14604582241304730.

DOI:10.1177/14604582241304730

PMID:39716973

Abstract

OBJECTIVE

This study aimed to assess the practicality and trustworthiness of explainable artificial intelligence (XAI) methods used for explaining clinical predictive models.

METHODS

Two popular XAI methods used for explaining clinical predictive models were evaluated on three criteria: their ability to generate domain-appropriate representations, their impact on clinical workflow, and their consistency. Explanations were benchmarked against the true clinical deterioration triggers recorded in the data system, and the agreement was quantified. The evaluation used two Electronic Medical Records datasets from major hospitals in Australia. A senior clinician examined and commented on the results.
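The abstract does not state how agreement with the recorded triggers was computed. As a minimal sketch of one possible approach, the Python below treats agreement as a top-k hit rate: the fraction of cases in which the recorded trigger is among the k features with the largest absolute attribution. The function names, the feature names, the attribution values, and the choice of k are all hypothetical illustrations, not the authors' method or data.

```python
from typing import Dict, List


def top_k_features(attributions: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda name: abs(attributions[name]), reverse=True)
    return ranked[:k]


def trigger_concordance(
    explanations: List[Dict[str, float]],
    triggers: List[str],
    k: int = 3,
) -> float:
    """Fraction of cases whose recorded trigger is among the top-k explained features.

    This is an assumed, illustrative agreement measure, not the metric from the paper.
    """
    if len(explanations) != len(triggers):
        raise ValueError("explanations and triggers must have the same length")
    if not explanations:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for attrs, trigger in zip(explanations, triggers)
        if trigger in top_k_features(attrs, k)
    )
    return hits / len(explanations)


# Hypothetical per-case feature attributions (made-up values for illustration only).
example_explanations = [
    {"heart_rate": 0.40, "resp_rate": -0.10, "sbp": 0.05, "spo2": 0.02},
    {"heart_rate": 0.05, "resp_rate": 0.30, "sbp": -0.35, "spo2": 0.01},
    {"heart_rate": 0.20, "resp_rate": 0.25, "sbp": 0.01, "spo2": -0.02},
    {"heart_rate": -0.02, "resp_rate": 0.10, "sbp": 0.03, "spo2": -0.50},
]
# Hypothetical recorded deterioration trigger for each case.
example_triggers = ["heart_rate", "sbp", "spo2", "resp_rate"]

print(trigger_concordance(example_explanations, example_triggers, k=2))
```

With a measure of this kind, a larger k makes agreement easier to achieve, so any reported value would need to be read alongside the k that was used.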

RESULTS

The findings show that the explanations violate the consistency criterion and agree only moderately (concordance 0.47-0.8) with the true triggers. This undermines reliability and actionability, both criteria for clinicians' trust in XAI.

CONCLUSION

The explanations are not trustworthy enough to guide clinical interventions, though they may offer useful insights and help with model troubleshooting. Clinician-informed XAI development and presentation, clear disclaimers on limitations, and critical clinical judgment can promote informed decisions and prevent over-reliance.

Similar articles

How Explainable Artificial Intelligence Can Increase or Decrease Clinicians' Trust in AI Applications in Health Care: Systematic Review.

JMIR AI. 2024 Oct 30;3:e53207. doi: 10.2196/53207.

Investigating Protective and Risk Factors and Predictive Insights for Aboriginal Perinatal Mental Health: Explainable Artificial Intelligence Approach.

J Med Internet Res. 2025 Apr 30;27:e68030. doi: 10.2196/68030.

Guidelines and evaluation of clinical explainable AI in medical image analysis.

Med Image Anal. 2023 Feb;84:102684. doi: 10.1016/j.media.2022.102684. Epub 2022 Nov 17.

BenchXAI: Comprehensive benchmarking of post-hoc explainable AI methods on multi-modal biomedical data.

Comput Biol Med. 2025 Jun;191:110124. doi: 10.1016/j.compbiomed.2025.110124. Epub 2025 Apr 15.

Explainable artificial intelligence in breast cancer detection and risk prediction: A systematic scoping review.

Cancer Innov. 2024 Jul 3;3(5):e136. doi: 10.1002/cai2.136. eCollection 2024 Oct.

Towards clinical prediction with transparency: An explainable AI approach to survival modelling in residential aged care.

Comput Methods Programs Biomed. 2025 May;263:108653. doi: 10.1016/j.cmpb.2025.108653. Epub 2025 Feb 15.

Decoding the black box: Explainable AI (XAI) for cancer diagnosis, prognosis, and treatment planning-A state-of-the art systematic review.

Int J Med Inform. 2025 Jan;193:105689. doi: 10.1016/j.ijmedinf.2024.105689. Epub 2024 Nov 4.

Systematic literature review on the application of explainable artificial intelligence in palliative care studies.

Int J Med Inform. 2025 Aug;200:105914. doi: 10.1016/j.ijmedinf.2025.105914. Epub 2025 Apr 8.

Elucidating Discrepancy in Explanations of Predictive Models Developed Using EMR.

Stud Health Technol Inform. 2024 Jan 25;310:865-869. doi: 10.3233/SHTI231088.

Cited by

AI-driven analysis by identifying risk factors of VL relapse in HIV co-infected patients.

Sci Rep. 2025 Jul 1;15(1):21067. doi: 10.1038/s41598-025-07406-7.

Clinician-informed XAI evaluation checklist with metrics (CLIX-M) for AI-powered clinical decision support systems.

NPJ Digit Med. 2025 Jun 14;8(1):364. doi: 10.1038/s41746-025-01764-2.
