OBJECTIVES

This study investigated the impact of human-large language model (LLM) collaboration on the accuracy and efficiency of brain MRI differential diagnosis.

MATERIALS AND METHODS

In this retrospective study, forty brain MRI cases with a challenging but definitive diagnosis were randomized into two groups of twenty cases each. Six radiology residents with an average of 6.3 months of experience reading brain MRI exams evaluated one set of cases supported by conventional internet search (conventional) and the other set using an LLM-based search engine and hybrid chatbot (LLM-assisted). A cross-over design ensured that each case was examined with both workflows with equal frequency. For each case, readers were instructed to determine the three most likely differential diagnoses. LLM responses were analyzed by a panel of radiologists. Benefits and challenges in human-LLM interaction were derived from observations and participant feedback.
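The cross-over allocation described above can be illustrated with a minimal sketch. The abstract does not say how readers were allocated to workflows, so the alternating scheme, the reader labels, and the set names below are all assumptions made for illustration only.

```python
from collections import Counter


def crossover_assignment(readers, case_sets, workflows):
    """Alternate which case set each reader pairs with which workflow.

    Hypothetical allocation scheme; not taken from the study itself.
    """
    set_a, set_b = case_sets
    wf_1, wf_2 = workflows
    plan = {}
    for index, reader in enumerate(readers):
        if index % 2 == 0:
            plan[reader] = {set_a: wf_1, set_b: wf_2}
        else:
            plan[reader] = {set_a: wf_2, set_b: wf_1}
    return plan


# Six residents, as in the abstract; the labels are placeholders.
readers = [f"reader_{i}" for i in range(1, 7)]
plan = crossover_assignment(
    readers, ("set_A", "set_B"), ("conventional", "llm_assisted")
)

# Count how often each case set is read under each workflow.
counts = Counter(
    (case_set, workflow)
    for assignment in plan.values()
    for case_set, workflow in assignment.items()
)
for key, n in sorted(counts.items()):
    print(key, n)
```

With an even number of readers, this alternation gives every case set the same number of reads under each workflow, which is the balance property the cross-over design is meant to guarantee.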

RESULTS

LLM-assisted brain MRI differential diagnosis was more accurate than the conventional workflow (correct diagnoses: 70/114, 61.4%, LLM-assisted vs 53/114, 46.5%, conventional; p = 0.033, chi-square test). No difference in interpretation time or level of confidence was observed. An analysis of LLM responses revealed that correct LLM suggestions translated into correct reader responses in 82.1% of cases (60/73). Inaccurate case descriptions by readers (9.2% of cases), LLM hallucinations (11.5% of cases), and insufficient contextualization of LLM responses were identified as challenges related to human-LLM interaction.
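The accuracy comparison above can be recomputed from the reported counts. The following is a minimal sketch using only the Python standard library. The abstract names a chi-square test but does not say whether a continuity correction was applied, so both variants are computed. The function name and the `yates` flag are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square test (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Yates continuity correction (assumed option, not stated in the abstract)
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts from the abstract: correct diagnoses out of total reads per workflow.
llm_correct, llm_total = 70, 114
conv_correct, conv_total = 53, 114

for yates in (False, True):
    stat, p = chi_square_2x2(
        llm_correct, llm_total - llm_correct,
        conv_correct, conv_total - conv_correct,
        yates=yates,
    )
    print(f"yates={yates}: chi2={stat:.3f}, p={p:.4f}")
```

Under these assumptions, the continuity-corrected variant gives a p-value of roughly 0.03, close to the reported 0.033, while the uncorrected variant gives a somewhat smaller value. Which variant the authors used cannot be determined from the abstract alone.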

CONCLUSION

Human-LLM collaboration has the potential to improve brain MRI differential diagnosis. Yet, several challenges must be addressed to ensure effective adoption and user acceptance.

KEY POINTS

Question: While large language models (LLM) have the potential to support radiological differential diagnosis, the role of human-LLM collaboration in this context remains underexplored.

Findings: LLM-assisted brain MRI differential diagnosis yielded superior accuracy over conventional internet search. Inaccurate case descriptions, LLM hallucinations, and insufficient contextualization were identified as potential challenges.

Clinical relevance: Our results highlight the potential of an LLM-assisted workflow to increase diagnostic accuracy but underline the necessity to study collaborative efforts between humans and LLMs over LLMs in isolation.

Human-AI collaboration in large language model-assisted brain MRI differential diagnosis: a usability study.
