
Performance Comparison of Junior Residents and ChatGPT in the Objective Structured Clinical Examination (OSCE) for Medical History Taking and Documentation of Medical Records: Development and Usability Study.

Authors

Huang Ting-Yun, Hsieh Pei Hsing, Chang Yung-Chun

Affiliations

Shuang-Ho Hospital, Taipei Medical University, New Taipei City, Taiwan.

Graduate Institute of Data Science, Taipei Medical University, Zhonghe District, New Taipei City, Taiwan.

Publication Information

JMIR Med Educ. 2024 Nov 21;10:e59902. doi: 10.2196/59902.

Abstract

BACKGROUND

This study explores the cutting-edge abilities of large language models (LLMs) such as ChatGPT in medical history taking and medical record documentation, with a focus on their practical effectiveness in clinical settings, an area vital to the progress of medical artificial intelligence.

OBJECTIVE

Our aim was to assess the capability of ChatGPT versions 3.5 and 4.0 in performing medical history taking and medical record documentation in simulated clinical environments. The study compared the performance of nonmedical individuals using ChatGPT with that of junior medical residents.

METHODS

A simulation involving standardized patients was designed to mimic authentic medical history-taking interactions. Five nonmedical participants used ChatGPT versions 3.5 and 4.0 to take medical histories and document medical records, mirroring the tasks performed by 5 junior residents in identical scenarios. A total of 10 diverse scenarios were examined.

RESULTS

Two senior emergency physicians evaluated the medical documentation created by laypersons with ChatGPT assistance and that created by junior residents, using the audio recordings and the final medical records. The assessment used the Objective Structured Clinical Examination benchmarks in Taiwan as a reference. ChatGPT-4.0 exhibited substantial improvements over its predecessor and met or exceeded the performance of its human counterparts in both checklist and global assessment scores. Although the overall quality of human consultations remained higher, ChatGPT-4.0's proficiency in medical documentation was notably promising.

CONCLUSIONS

The performance of ChatGPT 4.0 was on par with that of human participants in Objective Structured Clinical Examination evaluations, signifying its potential in medical history taking and medical record documentation. Despite this, the superiority of human consultations in terms of quality was evident. The study underscores both the promise and the current limitations of LLMs in clinical practice.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b84c/11612517/1585bdbc0ea8/mededu-v10-e59902-g001.jpg
