McCradden Melissa D, Thai Kelly, Assadi Azadeh, Tonekaboni Sana, Stedman Ian, Joshi Shalmali, Zhang Minfan, Chevalier Fanny, Goldenberg Anna
The Hospital for Sick Children, Toronto, Ontario, Canada.
SickKids Research Institute, Toronto, Ontario, Canada.
BMJ Evid Based Med. 2025 May 20;30(3):183-193. doi: 10.1136/bmjebm-2024-112919.
OBJECTIVE: To develop a framework for good clinical decision-making using machine learning (ML) models for interventional, patient-level decisions.

DESIGN: Grounded theory qualitative interview study.

SETTING: Primarily single-site at a major urban academic paediatric hospital, with external sampling.

PARTICIPANTS: Sixteen participants with experience working in acute care environments were identified through purposive sampling: physicians (n=10), nurses (n=3), respiratory therapists (n=2) and an ML specialist (n=1). Individuals were recruited to represent a spectrum of ML knowledge (three expert, four knowledgeable and nine non-expert) and years of experience (median=12.9 years postgraduation). Recruitment proceeded through snowball sampling, with individuals approached to represent a diversity of fields, levels of experience and attitudes towards artificial intelligence (AI)/ML. A member check step and a consultation with patients were undertaken to vet the framework, resulting in minor revisions to its wording and framing.

INTERVENTIONS: A semi-structured virtual interview simulating an intensive care unit handover for a hypothetical patient case, built around a simulated ML model and seven visualisations based on established methods for model interpretability in healthcare. Participants were asked to make an initial care plan for the patient, then were presented with a model prediction followed by the seven visualisations to explore their judgement, the visualisations' potential influence, and their understanding of them. Two visualisations contained contradictory information to probe how participants resolved the conflict. The ethical justifiability and clinical reasoning process were explored.

MAIN OUTCOME: A comprehensive framework, grounded in established medicolegal and ethical standards, that accounts for the incorporation of inference from ML models.

RESULTS: We found that for making good decisions, participants reflected across six main categories: evidence, facts and medical knowledge relevant to the patient's condition; how that knowledge may be applied to this particular patient; patient-level, family-specific and local factors; facts about the model, its development and testing; the patient-level knowledge sufficiently represented by the model; and the model's incorporation of relevant contextual factors. This judgement centred on, and was anchored most heavily in, the overall balance of benefits and risks to the patient, framed by the goals of care. We found evidence of automation bias: many participants assumed that if the model's explanation conflicted with their prior knowledge, their own judgement must be incorrect; others concluded the exact opposite, drawing on their medical knowledge to reject the incorrect information in the explanation. Regarding knowledge about the model, participants most consistently wanted to know the model's historical performance in the patient cohort of the local unit where the hypothetical patient was situated.

CONCLUSION: Good decisions using AI tools require reflection across multiple domains. We provide an actionable framework and question guide to support clinical decision-making with AI.