Large language models can support generation of standardized discharge summaries - A retrospective study utilizing ChatGPT-4 and electronic health records.

Affiliations

Centre for Depression, Anxiety Disorders and Psychotherapy, Psychiatric University Hospital Zurich (PUK), Zurich, Switzerland; Faculty of Medicine, University of Zurich (UZH), Zurich, Switzerland.

Center for Acute Psychiatry and Psychotherapy, Psychiatric University Hospital Zurich (PUK), Zurich, Switzerland; Faculty of Medicine, University of Zurich (UZH), Zurich, Switzerland.

Publication information

Int J Med Inform. 2024 Dec;192:105654. doi: 10.1016/j.ijmedinf.2024.105654. Epub 2024 Oct 14.

DOI: 10.1016/j.ijmedinf.2024.105654
PMID: 39437512
Abstract

OBJECTIVE

To evaluate whether psychiatric discharge summaries (DS) generated with ChatGPT-4 from electronic health records (EHR) can match the quality of DS written by psychiatric residents.

METHODS

At a psychiatric primary care hospital, we compared 20 inpatient DS written by residents with DS generated by ChatGPT-4 from pseudonymized residents' notes in the patients' EHRs and a standardized prompt. Eight blinded psychiatry specialists rated both versions on a custom Likert scale from 1 to 5 across 15 quality subcategories. The primary outcome was the overall rating difference between the two groups. The secondary outcomes were the rating differences at the level of individual questions, cases, and raters.
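
The rating design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a fully crossed design (every rater scores every case on every question, for both versions), the scores are random placeholders rather than study data, and the names `simulate_ratings` and `mean_difference` are hypothetical. It shows how an overall difference and the question-, case-, and rater-level differences could be aggregated; it does not reproduce the authors' significance testing.

```python
import random
from collections import defaultdict
from statistics import mean

# Design sizes stated in the Methods: 20 cases, 8 raters, 15 subcategories.
N_CASES, N_RATERS, N_QUESTIONS = 20, 8, 15


def simulate_ratings(seed=0):
    """Return placeholder ratings as (case, rater, question, source, score).

    Scores are random integers from 1 to 5. They are NOT the study data.
    """
    rng = random.Random(seed)
    return [
        (case, rater, question, source, rng.randint(1, 5))
        for case in range(N_CASES)
        for rater in range(N_RATERS)
        for question in range(N_QUESTIONS)
        for source in ("human", "ai")
    ]


def mean_difference(ratings, level=None):
    """Mean human-minus-AI rating, overall or grouped by one design level."""
    index = {"case": 0, "rater": 1, "question": 2}
    groups = defaultdict(lambda: {"human": [], "ai": []})
    for row in ratings:
        key = "overall" if level is None else row[index[level]]
        groups[key][row[3]].append(row[4])
    return {k: mean(v["human"]) - mean(v["ai"]) for k, v in groups.items()}


ratings = simulate_ratings()
overall = mean_difference(ratings)["overall"]      # primary outcome
by_question = mean_difference(ratings, "question")  # secondary outcomes
by_case = mean_difference(ratings, "case")
by_rater = mean_difference(ratings, "rater")
print(len(ratings), len(by_question), len(by_case), len(by_rater))
# prints: 4800 15 20 8
```

Under the fully crossed assumption, the design yields 20 x 8 x 15 x 2 = 4800 individual item ratings, which is what the first printed number reflects.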

RESULTS

Human-written DS were rated significantly higher than AI-DS (mean ratings: human 3.78, AI 3.12; p < 0.05). They surpassed AI-DS significantly in 12/15 questions and 16/20 cases and were favored significantly by 7/8 raters. For "low expected correction effort", human DS were rated as 67% favorable, 19% neutral, and 14% unfavorable, whereas AI-DS were rated as 22% favorable, 33% neutral, and 45% unfavorable. Hallucinations were present in 40% of AI-DS, with 37.5% deemed highly clinically relevant. Minor content mistakes were found in 30% of AI-DS and 10% of human DS. Raters correctly identified AI-DS with 81% sensitivity and 75% specificity.
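
As a brief illustration of the detection figures, the sketch below shows the standard definitions of sensitivity and specificity, treating "AI-written" as the positive class. The abstract reports only percentages, so the confusion-matrix counts used here are hypothetical and were picked simply to reproduce 81% and 75%. The second part converts the hallucination percentages into counts, assuming both percentages refer to summaries (40% of the 20 AI-DS, then 37.5% of those); the abstract does not say whether 37.5% counts summaries or individual hallucinations.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for illustration only; the abstract does not report
# the underlying confusion matrix. Positive class = AI-written summary.
sens, spec = sensitivity_specificity(tp=81, fn=19, tn=75, fp=25)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.81, specificity=0.75

# Hallucination shares, read as shares of the 20 AI-written summaries
# (an assumption about how the percentages are meant).
n_ai_ds = 20
with_hallucinations = round(0.40 * n_ai_ds)           # 8 summaries
highly_relevant = round(0.375 * with_hallucinations)  # 3 summaries
print(with_hallucinations, highly_relevant)
# prints: 8 3
```

Read this way, sensitivity is the share of AI-DS that raters correctly labeled as AI-written, and specificity is the share of human DS that raters correctly labeled as human-written.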

DISCUSSION

Overall, AI-DS did not match the quality of resident-written DS but performed similarly in 20% of cases and were rated as favorable for "low expected correction effort" in 22% of cases. AI-DS lacked most in content specificity, ability to distill key case information, and coherence but performed adequately in conciseness, adherence to formalities, relevance of included content, and form.

CONCLUSION

LLM-written DS show potential as templates for physicians to finalize, potentially saving time in the future.
