Skryd Anthony, Lawrence Katharine
Department of Medicine, NYU Langone Health, New York City, NY, United States.
Department of Population Health, NYU Grossman School of Medicine, New York City, NY, United States.
JMIR Form Res. 2024 May 8;8:e51346. doi: 10.2196/51346.
Large language models (LLMs) are computational artificial intelligence systems with advanced natural language processing capabilities that have recently been popularized among health care students and educators due to their ability to provide real-time access to a vast amount of medical knowledge. The adoption of LLM technology into medical education and training has varied, and little empirical evidence exists to support its use in clinical teaching environments.
The aim of the study is to identify and qualitatively evaluate potential use cases and limitations of LLM technology for real-time ward-based educational contexts.
A brief, single-site exploratory evaluation of the publicly available ChatGPT-3.5 (OpenAI) was conducted by incorporating the tool into the daily attending rounds of a general internal medicine inpatient service at a large urban academic medical center. ChatGPT was integrated into rounds via both structured and organic use, using the web-based "chatbot" style interface to interact with the LLM through conversational free-text and discrete queries. A qualitative approach based on phenomenological inquiry was used to identify key insights related to the use of ChatGPT through analysis of ChatGPT conversation logs and associated shorthand notes from the clinical sessions.
Identified use cases for ChatGPT integration included addressing medical knowledge gaps through discrete medical knowledge inquiries, building differential diagnoses and engaging dual-process thinking, challenging medical axioms, using cognitive aids to support acute care decision-making, and improving complex care management by facilitating conversations with subspecialties. Potential additional uses included engaging in difficult conversations with patients, exploring ethical challenges and general medical ethics teaching, personal continuing medical education resources, developing ward-based teaching tools, supporting and automating clinical documentation, and supporting productivity and task management. LLM biases, misinformation, ethics, and health equity were identified as areas of concern and potential limitations to clinical and training use. A code of conduct on ethical and appropriate use was also developed to guide team usage on the wards.
Overall, ChatGPT offers a novel tool to enhance ward-based learning through rapid information querying, second-order content exploration, and engaged team discussion regarding generated responses. More research is needed to fully understand contexts for educational use, particularly regarding the risks and limitations of the tool in clinical settings and its impacts on trainee development.