Rezaeian Olya, Asan Onur, Bayrak Alparslan Emrah
Department of Systems and Enterprises, Stevens Institute of Technology, Hoboken, NJ, United States.
Department of Mechanical Engineering and Mechanics, Lehigh University, Bethlehem, PA, United States.
Appl Ergon. 2025 Nov;129:104577. doi: 10.1016/j.apergo.2025.104577. Epub 2025 Jun 26.
Advances in machine learning have created new opportunities to develop artificial intelligence (AI)-based clinical decision support systems from past clinical data and to improve diagnosis decisions in life-threatening illnesses such as breast cancer. Providing explanations for AI recommendations is one possible way to address trust and usability issues in black-box AI systems. This paper presents the results of an experiment assessing how varying levels of AI explanation affect clinicians' trust and diagnosis accuracy in a breast cancer application, and how demographics influence these outcomes. The study includes 28 clinicians with varying medical roles related to breast cancer diagnosis. The results show that increasing levels of explanation do not always improve trust or diagnosis performance. The results also show that while some self-reported measures, such as AI familiarity, depend on gender, age, and experience, the behavioral assessments of trust and performance are independent of those variables.