Evaluating the Usability, Technical Performance, and Accuracy of Artificial Intelligence Scribes for Primary Care: Competitive Analysis.

Authors

Ha Emily, Choon-Kon-Yune Isabelle, Murray LaShawn, Luan Siying, Montague Enid, Bhattacharyya Onil, Agarwal Payal

Affiliations

Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.

Women's College Hospital Institute for Health System Solutions and Virtual Care, Women's College Hospital, 76 Grenville Street, 6th Floor, Toronto, ON, M5S 1B2, Canada, 1 4163236400.

Publication information

JMIR Hum Factors. 2025 Jul 23;12:e71434. doi: 10.2196/71434.


DOI:10.2196/71434
PMID:40700466
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12309782/
Abstract

BACKGROUND: Primary care providers (PCPs) face significant burnout due to increasing administrative and documentation demands, contributing to job dissatisfaction and impacting care quality. Artificial intelligence (AI) scribes have emerged as potential solutions to reduce administrative burden by automating clinical documentation of patient encounters. Although AI scribes are gaining popularity in primary care, there is limited information on their usability, effectiveness, and accuracy.

OBJECTIVE: This study aimed to develop and apply an evaluation framework to systematically assess the usability, technical performance, and accuracy of various AI scribes used in primary care settings across Canada and the United States.

METHODS: We conducted a systematic comparison of a suite of AI scribes using competitive analysis methods. An evaluation framework was developed using expert usability approaches and human factors engineering principles and comprises 3 domains: usability, effectiveness and technical performance, and accuracy and quality. Audio files from 4 standardized patient encounters were used to generate transcripts and SOAP (Subjective, Objective, Assessment, and Plan)-format medical notes from each AI scribe. A verbatim transcript, detailed case notes, and physician-written medical notes for each audio file served as a benchmark for comparison against the AI-generated outputs. Applicable items were rated on a 3-point Likert scale (1=poor, 2=good, 3=excellent). Additional insights were gathered from clinical experts, vendor questionnaires, and public resources to support usability, effectiveness, and quality findings.

RESULTS: In total, 6 AI scribes were evaluated, with notable performance differences. Most AI scribes could be accessed via various platforms (n=4) and launched within common electronic medical records, though data exchange capabilities were limited. Nearly all AI scribes generated SOAP-format notes in approximately 1 minute for a 15-minute standardized encounter (n=5), though documentation time increased with encounter length and topic complexity. While all AI scribes produced good to excellent quality medical notes, none were consistently error-free. Common errors included deletion, omission, and SOAP structure errors. Factors such as extraneous conversations and multiple speakers impacted the accuracy of both the transcript and medical note, with some AI scribes producing excellent notes despite minor transcript issues and vice versa. Limitations in usability, technical performance, and accuracy suggest areas for improvement to fully realize AI scribes' potential in reducing administrative burden for PCPs.

CONCLUSIONS: This study offers one of the first systematic evaluations of the usability, effectiveness, and accuracy of a suite of AI scribes currently used in primary care, providing benchmark data for further research, policy, and practice. While AI scribes show promise in reducing documentation burdens, improvements and ongoing evaluations are essential to ensure safe and effective use. Future studies should assess AI scribe performance in real-world settings across diverse populations to support equitable and reliable applications.
