
Comparing artificial intelligence- vs clinician-authored summaries of simulated primary care electronic health records.

Authors

Shemtob Lara, Nouri Abdullah, Harvey-Sullivan Adam, Qiu Connor S, Martin Jonathan, Martin Martha, Noden Sara, Rob Tanveer, Neves Ana L, Majeed Azeem, Clarke Jonathan, Beaney Thomas

Affiliations

Department of Primary Care and Public Health, Imperial College London, London W12 0BZ, United Kingdom.

St Andrews Health Centre, London E3 3FF, United Kingdom.

Publication information

JAMIA Open. 2025 Jul 30;8(4):ooaf082. doi: 10.1093/jamiaopen/ooaf082. eCollection 2025 Aug.

DOI: 10.1093/jamiaopen/ooaf082
PMID: 40741008
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12309840/
Abstract

OBJECTIVE

To compare clinical summaries generated by GPT-4 from simulated patient primary care electronic health records (EHRs) with summaries generated by clinicians, across multiple domains of quality including utility, concision, accuracy, and bias.

MATERIALS AND METHODS

Seven primary care physicians generated 70 simulated patient EHR notes, each representing 10 patient contacts with the practice over at least 2 years. Each record was summarized by a different clinician and by GPT-4. Artificial intelligence (AI)- and clinician-authored summaries were rated blind by clinicians according to 8 domains of quality and an overall rating.

RESULTS

The median time taken for a clinician to read through and assimilate the information in the EHRs before summarizing was 7 minutes. Clinicians rated clinician-authored summaries higher than AI-authored summaries overall (7.39 vs 7.00 out of 10; P = .02), but with greater variability in clinician-authored summary ratings. AI- and clinician-authored summaries had similar accuracy, and AI-authored summaries were less likely to omit important information and more likely to use patient-friendly language.

DISCUSSION

Although AI-authored summaries were rated slightly lower overall compared with clinician-authored summaries, they demonstrated similar accuracy and greater consistency. This demonstrates potential applications for generating summaries in primary care, particularly given the substantial time taken for clinicians to undertake this work.

CONCLUSION

The results suggest the feasibility, utility, and acceptability of integrating AI-authored summaries into EHRs to support clinicians in primary care. AI summarization tools have the potential to improve healthcare productivity, including by enabling clinicians to spend more time on direct patient care.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad73/12309840/81b553192cc9/ooaf082f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad73/12309840/c356ef494e54/ooaf082f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad73/12309840/4bc5a1f960f5/ooaf082f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad73/12309840/fe2b8d51e1d1/ooaf082f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad73/12309840/786cd5add260/ooaf082f5.jpg

