
A typology of physician input approaches to using AI chatbots for clinical decision-making: a mixed methods study.

Author Information

Siden Rachel, Kerman Hannah, Gallo Robert J, Cool Joséphine A, Hom Jason, Goh Ethan, Ahuja Neera, Heidenreich Paul, Shieh Lisa, Yang Daniel, Chen Jonathan H, Rodman Adam, Holdsworth Laura M

Affiliations

Department of Medicine, Stanford University School of Medicine, Palo Alto, CA, USA.

Division of Hospital Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Publication Information

medRxiv. 2025 Jul 23:2025.07.23.25332002. doi: 10.1101/2025.07.23.25332002.

Abstract

BACKGROUND

Large language model (LLM) chatbots demonstrate high degrees of accuracy, yet recent studies found that physicians using these same chatbots may score no better, or even worse, on clinical reasoning tests compared with the chatbot performing alone with researcher-curated prompts. How physicians approach inputting information into chatbots remains unknown.

OBJECTIVE

This study aimed to identify how physicians interacted with LLM chatbots on clinical reasoning tasks to create a typology of input approaches, exploring whether input approach type was associated with improved clinical reasoning performance.

METHODS

We carried out a mixed methods study in three steps. First, we conducted semi-structured interviews with U.S. physicians on experiences using an LLM chatbot and analyzed transcripts using the Framework Method to develop a typology based on input patterns. Next, we analyzed the chat logs of physicians who used a chatbot while solving clinical cases, categorizing each case to an input approach type. Lastly, we used a linear mixed-effects model to compare each input approach type with performance on the clinical cases.
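The abstract does not specify the exact model, but a linear mixed-effects analysis of this kind could, for example, treat the case score as the outcome, input approach type as a fixed effect, and physician as a random intercept to account for repeated cases per physician. The sketch below illustrates this with statsmodels; the column names, simulated data, and model specification are assumptions for illustration, not the authors' code.

```python
# A minimal sketch (not the authors' code) of a linear mixed-effects model:
# case score as the outcome, input approach type as a fixed effect, and a
# random intercept per physician for repeated cases. All names and values
# here are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
approaches = ["copy_paster", "selective_copy_paster", "summarizer", "searcher"]

# Simulated chat-log data: 12 physicians, 5 cases each.
rows = []
for pid in range(12):
    physician_effect = rng.normal(0, 5)  # physician-level variation
    for _ in range(5):
        rows.append({
            "physician_id": f"p{pid}",
            "approach": rng.choice(approaches),
            "score": 75 + physician_effect + rng.normal(0, 8),
        })
df = pd.DataFrame(rows)

# score ~ input approach type, with a random intercept for each physician.
model = smf.mixedlm("score ~ C(approach)", data=df, groups=df["physician_id"])
print(model.fit().summary())
```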

RESULTS

We identified four input approach types based on patterns of "content amount": copy-paster (entire case), selective copy-paster (pieces of a case), summarizer (user-generated case summary), and searcher (short queries). Copy-pasting and searching were the most commonly used approaches. No single type was associated with scoring higher on clinical cases.
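The study assigned these types qualitatively, by applying the Framework Method to interview transcripts and chat logs, not algorithmically. Purely as an illustration of the "content amount" dimension that separates the four types, a rough heuristic might look like the following; the overlap measure and thresholds are assumptions, not the authors' method.

```python
# Illustrative only: the study categorized cases qualitatively. This toy
# heuristic labels a single prompt by how much of the case text it reuses.
def classify_input(prompt: str, case_text: str) -> str:
    """Label one chatbot prompt by its 'content amount' relative to the case."""
    prompt_words = prompt.split()
    case_words = set(case_text.split())
    if not prompt_words:
        return "searcher"
    # Share of prompt words drawn from the case, and prompt length vs. case length.
    overlap = sum(w in case_words for w in prompt_words) / len(prompt_words)
    length_ratio = len(prompt_words) / max(len(case_text.split()), 1)
    if overlap > 0.8 and length_ratio > 0.8:
        return "copy_paster"            # pasted essentially the whole case
    if overlap > 0.8:
        return "selective_copy_paster"  # pasted selected pieces of the case
    if length_ratio > 0.2:
        return "summarizer"             # wrote their own condensed summary
    return "searcher"                   # short, search-style query
```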

DISCUSSION

This study adds to our understanding of how physicians approach using chatbots and identifies the ways in which they intuitively interact with them.

CONCLUSIONS

Purposeful training and support are needed to help physicians effectively use emerging AI technologies and realize their potential for supporting safe and effective medical decision-making in practice.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4078/12330456/345dfb551b7e/nihpp-2025.07.23.25332002v1-f0001.jpg
