Large language models improve clinical decision making of medical students through patient simulation and structured feedback: a randomized controlled trial.

Affiliations

Connectome - Student Association for Neurosurgery, Neurology and Neurosciences, Berlin, Germany.

Department of Neurosurgery, University Hospital of Münster, Münster, Germany.

Publication information

BMC Med Educ. 2024 Nov 28;24(1):1391. doi: 10.1186/s12909-024-06399-7.

DOI:10.1186/s12909-024-06399-7
PMID:39609823
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11605890/
Abstract

BACKGROUND

Clinical decision-making (CDM) refers to physicians' ability to gather, evaluate, and interpret relevant diagnostic information. An integral component of CDM is the medical history conversation, traditionally practiced on real or simulated patients. In this study, we explored the potential of using large language models (LLMs) to simulate patient-doctor interactions and provide structured feedback.

METHODS

We developed AI prompts to simulate patients with different symptoms who engage in realistic medical history conversations. In our double-blind randomized design, the control group participated in simulated medical history conversations with AI patients, while the intervention group (feedback group) additionally received AI-generated feedback on their performance. We assessed the effect of feedback on participants' CDM performance, which was evaluated by two raters (ICC = 0.924) using the Clinical Reasoning Indicator - History Taking Inventory (CRI-HTI). The data were analyzed using a repeated-measures ANOVA.

RESULTS

Our final sample included 21 medical students (mean age = 22.10 years, semester = 4, 14 female). At baseline, the feedback group (mean = 3.28 ± 0.09 [standard deviation]) and the control group (3.21 ± 0.08) achieved similar CRI-HTI scores, indicating successful randomization. After only four training sessions, the feedback group (3.60 ± 0.13) outperformed the control group (3.02 ± 0.12), F(1,18) = 4.44, p = .049, with a large effect size (partial η² = 0.198). Specifically, the feedback group improved in the CDM subdomains of creating context (p = .046) and securing information (p = .018), while their ability to focus questions did not improve significantly (p = .265).
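The reported effect size can be cross-checked against the F statistic. The sketch below assumes the reported value is a partial eta squared and uses the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The function name `partial_eta_squared` is illustrative and does not come from the paper.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values taken from the abstract: F(1,18) = 4.44
eta_p2 = partial_eta_squared(4.44, df_effect=1, df_error=18)
print(f"partial eta squared = {eta_p2:.3f}")
```

With the abstract's values this gives 4.44 / (4.44 + 18) ≈ 0.198, which agrees with the reported effect size.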

CONCLUSION

The results suggest that AI-simulated medical history conversations can support CDM training, especially when combined with structured feedback. Such a training format may serve as a cost-effective supplement to existing training methods, better preparing students for real medical history conversations.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a45/11605890/b4bd3bd32966/12909_2024_6399_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a45/11605890/b7144680e41b/12909_2024_6399_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a45/11605890/c2b8ff2b0c22/12909_2024_6399_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a45/11605890/76594fb24a45/12909_2024_6399_Fig4_HTML.jpg

Similar articles

1. Large language models improve clinical decision making of medical students through patient simulation and structured feedback: a randomized controlled trial.
   BMC Med Educ. 2024 Nov 28;24(1):1391. doi: 10.1186/s12909-024-06399-7.
2. A Language Model-Powered Simulated Patient With Automated Feedback for History Taking: Prospective Study.
   JMIR Med Educ. 2024 Aug 16;10:e59213. doi: 10.2196/59213.
3. Development of a GPT-4-Powered Virtual Simulated Patient and Communication Training Platform for Medical Students to Practice Discussing Abnormal Mammogram Results With Patients: Multiphase Study.
   JMIR Form Res. 2025 Apr 17;9:e65670. doi: 10.2196/65670.
4. ChatGPT versus expert feedback on clinical reasoning questions and their effect on learning: a randomized controlled trial.
   Postgrad Med J. 2025 Apr 22;101(1195):458-463. doi: 10.1093/postmj/qgae170.
5. Enhancing Medical Interview Skills Through AI-Simulated Patient Interactions: Nonrandomized Controlled Trial.
   JMIR Med Educ. 2024 Sep 23;10:e58753. doi: 10.2196/58753.
6. Artificial Intelligence (AI)-Based simulators versus simulated patients in undergraduate programs: A protocol for a randomized controlled trial.
   BMC Med Educ. 2024 Nov 5;24(1):1260. doi: 10.1186/s12909-024-06236-x.
7. Effect of Artificial Intelligence Tutoring vs Expert Instruction on Learning Simulated Surgical Skills Among Medical Students: A Randomized Clinical Trial.
   JAMA Netw Open. 2022 Feb 1;5(2):e2149008. doi: 10.1001/jamanetworkopen.2021.49008.
8. How do medical students make sense of internal and external feedback to enhance their Dutch communication skills?
   BMC Med Educ. 2025 Feb 17;25(1):256. doi: 10.1186/s12909-025-06845-0.
9. Effects of interacting with a large language model compared with a human coach on the clinical diagnostic process and outcomes among fourth-year medical students: study protocol for a prospective, randomised experiment using patient vignettes.
   BMJ Open. 2024 Jul 18;14(7):e087469. doi: 10.1136/bmjopen-2024-087469.
10. Specific feedback makes medical students better communicators.
    BMC Med Educ. 2019 Feb 8;19(1):51. doi: 10.1186/s12909-019-1470-9.

Cited by

1. Letter to the editor regarding: "comparative analysis of the performance of the large language models DeepSeek-V3, DeepSeek-R1, OpenAI o3-mini and OpenAI o3-mini high in urology".
   World J Urol. 2025 Jul 17;43(1):446. doi: 10.1007/s00345-025-05832-w.
2. Assessing the Accuracy of Diagnostic Capabilities of Large Language Models.
   Diagnostics (Basel). 2025 Jun 29;15(13):1657. doi: 10.3390/diagnostics15131657.
3. Feasibility study of using GPT for history-taking training in medical education: a randomized clinical trial.
   BMC Med Educ. 2025 Jul 10;25(1):1030. doi: 10.1186/s12909-025-07614-9.
4. Comparative analysis of the performance of the large language models DeepSeek-V3, DeepSeek-R1, open AI-O3 mini and open AI-O3 mini high in urology.
   World J Urol. 2025 Jul 7;43(1):416. doi: 10.1007/s00345-025-05757-4.
5. Delving into the Practical Applications and Pitfalls of Large Language Models in Medical Education: Narrative Review.
   Adv Med Educ Pract. 2025 Apr 18;16:625-636. doi: 10.2147/AMEP.S497020. eCollection 2025.
6. Teaching opportunities for anamnesis interviews through AI based teaching role plays: a survey with online learning students from health study programs.
   BMC Med Educ. 2025 Feb 18;25(1):259. doi: 10.1186/s12909-025-06756-0.

References

1. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial.
   JAMA Netw Open. 2024 Oct 1;7(10):e2440969. doi: 10.1001/jamanetworkopen.2024.40969.
2. Can AI-Generated Clinical Vignettes in Japanese Be Used Medically and Linguistically?
   J Gen Intern Med. 2024 Dec;39(16):3282-3289. doi: 10.1007/s11606-024-09031-y. Epub 2024 Sep 23.
3. A Language Model-Powered Simulated Patient With Automated Feedback for History Taking: Prospective Study.
   JMIR Med Educ. 2024 Aug 16;10:e59213. doi: 10.2196/59213.
4. ChatGPT-4 accuracy for patient education in laryngopharyngeal reflux.
   Eur Arch Otorhinolaryngol. 2024 May;281(5):2547-2552. doi: 10.1007/s00405-024-08560-w. Epub 2024 Mar 16.
5. A Generative Pretrained Transformer (GPT)-Powered Chatbot as a Simulated Patient to Practice History Taking: Prospective, Mixed Methods Study.
   JMIR Med Educ. 2024 Jan 16;10:e53961. doi: 10.2196/53961.
6. Systematic testing of three Language Models reveals low language accuracy, absence of response stability, and a yes-response bias.
   Proc Natl Acad Sci U S A. 2023 Dec 19;120(51):e2309583120. doi: 10.1073/pnas.2309583120. Epub 2023 Dec 13.
7. Evaluating the performance of large language models in haematopoietic stem cell transplantation decision-making.
   Br J Haematol. 2024 Apr;204(4):1523-1528. doi: 10.1111/bjh.19200. Epub 2023 Dec 9.
8. ChatGPT Interactive Medical Simulations for Early Clinical Education: Case Study.
   JMIR Med Educ. 2023 Nov 10;9:e49877. doi: 10.2196/49877.
9. Online patient education in body contouring: A comparison between Google and ChatGPT.
   J Plast Reconstr Aesthet Surg. 2023 Dec;87:390-402. doi: 10.1016/j.bjps.2023.10.091. Epub 2023 Oct 20.
10. Assessing the potential of ChatGPT for patient education in the cardiology clinic.
    Prog Cardiovasc Dis. 2023 Nov-Dec;81:109-110. doi: 10.1016/j.pcad.2023.10.002. Epub 2023 Oct 11.