Brankovic Aida, Cook David, Rahman Jessica, Khanna Sankalp, Huang Wenjie
CSIRO Australian e-Health Research Centre, Brisbane, QLD, Australia.
Intensive Care Unit, Princess Alexandra Hospital, Brisbane, QLD, Australia.
Health Informatics J. 2024 Oct-Dec;30(4):14604582241304730. doi: 10.1177/14604582241304730.
This study aimed to assess the practicality and trustworthiness of explainable artificial intelligence (XAI) methods used for explaining clinical predictive models.
Two popular XAI methods used for explaining clinical predictive models were evaluated on their ability to generate domain-appropriate representations, their impact on clinical workflow, and their consistency. Explanations were benchmarked against the true clinical deterioration triggers recorded in the data system, and agreement was quantified. The evaluation was conducted on two electronic medical record datasets from major hospitals in Australia. Results were examined and commented on by a senior clinician.
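For illustration only, the sketch below shows one way agreement between XAI feature attributions and recorded deterioration triggers might be quantified. The feature names, synthetic attribution values, and top-k overlap metric are assumptions made for this example; they do not reflect the study's actual datasets or concordance measure.

```python
# Hypothetical sketch: quantifying agreement between XAI attributions and
# clinically recorded deterioration triggers. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
features = ["heart_rate", "resp_rate", "spo2", "sbp", "temperature", "gcs"]

# Attribution matrix from any XAI method (e.g. SHAP values), one row per patient.
attributions = rng.normal(size=(5, len(features)))

# Deterioration triggers recorded for each patient in the data system (assumed).
recorded_triggers = [
    {"heart_rate", "sbp"},
    {"spo2"},
    {"resp_rate", "spo2"},
    {"gcs"},
    {"sbp", "temperature"},
]

def top_k_features(attr_row, k=3):
    """Return the k features with the largest absolute attribution."""
    idx = np.argsort(-np.abs(attr_row))[:k]
    return {features[i] for i in idx}

# Per-patient concordance: fraction of recorded triggers that also appear
# among the top-k attributed features.
scores = []
for attr_row, triggers in zip(attributions, recorded_triggers):
    explained = top_k_features(attr_row)
    scores.append(len(explained & triggers) / len(triggers))

print(f"Mean concordance with recorded triggers: {np.mean(scores):.2f}")
```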
Findings demonstrate a violation of the consistency criterion and only moderate concordance (0.47-0.8) with true triggers, undermining reliability and actionability, both of which are criteria for clinicians' trust in XAI.
The explanations are not sufficiently trustworthy to guide clinical interventions, though they may offer useful insights and aid model troubleshooting. Clinician-informed XAI development and presentation, clear disclaimers about limitations, and critical clinical judgment can promote informed decisions and prevent over-reliance.