Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, Texas, USA.
J Med Ethics. 2024 Jul 23;50(8):544-551. doi: 10.1136/jme-2023-109338.
Rapid advancements in artificial intelligence and machine learning (AI/ML) in healthcare raise pressing questions about how much users should trust AI/ML systems, particularly for high-stakes clinical decision-making. Ensuring that user trust is properly calibrated to a tool's computational capacities and limitations has both practical and ethical implications, given that overtrust or undertrust can lead to over-reliance or under-reliance on algorithmic tools, with significant implications for patient safety and health outcomes. It is thus important to better understand how variability in trust criteria across stakeholders, settings, tools and use cases may influence approaches to using AI/ML tools in real-world settings. As part of a 5-year, multi-institutional study funded by the Agency for Healthcare Research and Quality, we identify trust criteria for a survival prediction algorithm intended to support clinical decision-making for left ventricular assist device therapy, using semistructured interviews (n=40) with patients and physicians, analysed via thematic analysis. Findings suggest that physicians and patients share similar empirical considerations for trust, focused primarily on the accuracy and validity of AI/ML estimates. Trust evaluations considered the nature, integrity and relevance of training data rather than the computational nature of the algorithms themselves, suggesting a need to distinguish 'source' from 'functional' explainability. To a lesser extent, trust criteria were also relational (endorsement from others) and sometimes based on personal beliefs and experience. We discuss implications for promoting appropriate and responsible trust calibration for clinical decision-making using AI/ML.