Encouragement vs. liability: How prompt engineering influences ChatGPT-4's radiology exam performance.

Affiliations

University of Massachusetts Chan Medical School, Worcester, MA, United States of America.

Department of Radiology, University of Massachusetts Chan Medical School, Worcester, MA, United States of America.

Publication information

Clin Imaging. 2024 Nov;115:110276. doi: 10.1016/j.clinimag.2024.110276. Epub 2024 Sep 6.

DOI:10.1016/j.clinimag.2024.110276

PMID:39288636

Abstract

Large language models (LLMs) like ChatGPT-4 hold significant promise in medical applications, especially in the field of radiology. While previous studies have shown the promise of ChatGPT-4 in text-based scenarios, its performance on image-based questions remains suboptimal. This study investigates the impact of prompt engineering on ChatGPT-4's accuracy on the 2022 American College of Radiology In-Training Test Questions for Diagnostic Radiology Residents, which include both text-based and image-based questions. Four personas were created, each with unique prompts, and evaluated using ChatGPT-4. Results indicate that encouraging prompts and those disclaiming responsibility led to higher overall accuracy (number of questions answered correctly) compared to other personas. Personas that threatened the LLM with legal action or mounting clinical responsibility not only scored lower but also refrained from answering questions at a higher rate. These findings highlight the importance of prompt context in optimizing LLM responses and the need for further research to integrate AI responsibly into medical practice.

Similar articles

Performance of GPT-4 with Vision on Text- and Image-based ACR Diagnostic Radiology In-Training Examination Questions.

Radiology. 2024 Sep;312(3):e240153. doi: 10.1148/radiol.240153.

Performance of GPT-4 on the American College of Radiology In-training Examination: Evaluating Accuracy, Model Drift, and Fine-tuning.

Acad Radiol. 2024 Jul;31(7):3046-3054. doi: 10.1016/j.acra.2024.04.006. Epub 2024 Apr 22.

Assessment of ChatGPT-4 in Family Medicine Board Examinations Using Advanced AI Learning and Analytical Methods: Observational Study.

JMIR Med Educ. 2024 Oct 8;10:e56128. doi: 10.2196/56128.

Can Artificial Intelligence Pass the American Board of Orthopaedic Surgery Examination? Orthopaedic Residents Versus ChatGPT.

Clin Orthop Relat Res. 2023 Aug 1;481(8):1623-1630. doi: 10.1097/CORR.0000000000002704. Epub 2023 May 23.

Evaluating ChatGPT-4's Diagnostic Accuracy: Impact of Visual Data Integration.

JMIR Med Inform. 2024 Apr 9;12:e55627. doi: 10.2196/55627.

The Rapid Development of Artificial Intelligence: GPT-4's Performance on Orthopedic Surgery Board Questions.

Orthopedics. 2024 Mar-Apr;47(2):e85-e89. doi: 10.3928/01477447-20230922-05. Epub 2023 Sep 27.

Could ChatGPT Pass the UK Radiology Fellowship Examinations?

Acad Radiol. 2024 May;31(5):2178-2182. doi: 10.1016/j.acra.2023.11.026. Epub 2023 Dec 29.

Comparative Performance of ChatGPT 3.5 and GPT4 on Rhinology Standardized Board Examination Questions.

OTO Open. 2024 Jun 27;8(2):e164. doi: 10.1002/oto2.164. eCollection 2024 Apr-Jun.

ChatGPT Earns American Board Certification in Hand Surgery.

Hand Surg Rehabil. 2024 Jun;43(3):101688. doi: 10.1016/j.hansur.2024.101688. Epub 2024 Mar 27.
