Kim Su Hwan, Schramm Severin, Riedel Evamaria Olga, Schmitzer Lena, Rosenkranz Enrike, Kertels Olivia, Bodden Jannis, Paprottka Karolin, Sepp Dominik, Renz Martin, Kirschke Jan, Baum Thomas, Maegerlein Christian, Boeckh-Behrens Tobias, Zimmer Claus, Wiestler Benedikt, Hedderich Dennis M
Institute of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, School of Medicine and Health, Technical University of Munich, Munich, Germany.
Radiol Med. 2025 Apr;130(4):555-566. doi: 10.1007/s11547-025-01964-6. Epub 2025 Feb 12.
To determine how automation bias (the inclination of humans to overly trust automated decision-making systems) can affect radiologists when interpreting AI-detected cerebral aneurysm findings in time-of-flight magnetic resonance angiography (TOF-MRA) studies.
Nine radiologists with varying levels of experience evaluated twenty TOF-MRA examinations for the presence of cerebral aneurysms. Every case was evaluated both with and without assistance by the AI software © mdbrain, with a washout period of at least four weeks in between. Half of the cases included at least one false-positive AI finding. Aneurysm ratings, follow-up recommendations, and reading times were assessed using the Wilcoxon signed-rank test.
False-positive AI results led to significantly higher suspicion of aneurysm findings (p = 0.01). Inexperienced readers further recommended significantly more intense follow-up examinations when presented with false-positive AI findings (p = 0.005). Reading times were significantly shorter with AI assistance in inexperienced (164.1 vs 228.2 s; p < 0.001), moderately experienced (126.2 vs 156.5 s; p < 0.009), and very experienced (117.9 vs 153.5 s; p < 0.001) readers alike.
Our results demonstrate the susceptibility of radiology readers to automation bias in detecting cerebral aneurysms in TOF-MRA studies when encountering false-positive AI findings. While AI systems for cerebral aneurysm detection can provide benefits, challenges in human-AI interaction need to be mitigated to ensure safe and effective adoption.