Richard Pak, Ericka Rovira, Anne McLaughlin
Department of Psychology, Clemson University, Clemson, SC, USA.
Department of Behavioral Sciences & Leadership, United States Military Academy, West Point, NY, USA.
Ergonomics. 2024 Nov 28:1-11. doi: 10.1080/00140139.2024.2434604.
As their capabilities have grown, AI-based chatbots have become increasingly popular tools for helping users answer complex queries. However, these chatbots may hallucinate, that is, generate incorrect but highly plausible-sounding information, more frequently than previously thought. It is therefore crucial to examine strategies that mitigate human susceptibility to hallucinated output. In a between-subjects experiment, participants completed a difficult quiz with assistance from either a polite or a neutral-toned AI chatbot that occasionally provided hallucinated (incorrect) information. Signal detection analysis revealed that participants interacting with the polite AI showed modestly higher sensitivity in detecting hallucinations and a more conservative response bias than those interacting with the neutral-toned AI. While the observed effect sizes were modest, even small improvements in users' ability to detect AI hallucinations can have significant consequences, particularly in high-stakes domains or when aggregated across millions of AI interactions.
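For readers unfamiliar with the signal detection measures the abstract refers to, the sketch below shows how sensitivity (d') and response bias (criterion c) are typically computed from hit and false-alarm counts. It is an illustrative example only: the counts are hypothetical, the mapping of "signal" trials to hallucinated responses is an assumption consistent with the abstract, and the log-linear correction is one common convention, not necessarily the one used in the study.

```python
from scipy.stats import norm

def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Compute signal detection sensitivity (d') and response bias (criterion c).

    Assumed mapping (not confirmed by the abstract): a "signal" trial is one
    where the chatbot hallucinated; a "hit" is correctly flagging a hallucinated
    answer; a "false alarm" is flagging a correct answer as hallucinated.
    Rates use a log-linear correction to avoid infinite z-scores at 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    d_prime = z_hit - z_fa              # sensitivity: ability to tell hallucinated from correct answers
    criterion = -0.5 * (z_hit + z_fa)   # bias: values > 0 indicate more conservative responding
    return d_prime, criterion

# Hypothetical counts for a single participant (not data from the study)
print(dprime_and_criterion(hits=12, misses=8, false_alarms=5, correct_rejections=25))
```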