Measuring the Impact of AI in the Diagnosis of Hospitalized Patients: A Randomized Clinical Vignette Survey Study.

Affiliations

Computer Science and Engineering, University of Michigan, Ann Arbor.

Now with Computer Science Courant Institute, New York University, New York.

Publication information

JAMA. 2023 Dec 19;330(23):2275-2284. doi: 10.1001/jama.2023.22295.

DOI: 10.1001/jama.2023.22295
PMID: 38112814
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10731487/

Abstract

IMPORTANCE: Artificial intelligence (AI) could support clinicians when diagnosing hospitalized patients; however, systematic bias in AI models could worsen clinician diagnostic accuracy. Recent regulatory guidance has called for AI models to include explanations to mitigate errors made by models, but the effectiveness of this strategy has not been established.

OBJECTIVES: To evaluate the impact of systematically biased AI on clinician diagnostic accuracy and to determine if image-based AI model explanations can mitigate model errors.

DESIGN, SETTING, AND PARTICIPANTS: Randomized clinical vignette survey study administered between April 2022 and January 2023 across 13 US states involving hospitalist physicians, nurse practitioners, and physician assistants.

INTERVENTIONS: Clinicians were shown 9 clinical vignettes of patients hospitalized with acute respiratory failure, including their presenting symptoms, physical examination, laboratory results, and chest radiographs. Clinicians were then asked to determine the likelihood of pneumonia, heart failure, or chronic obstructive pulmonary disease as the underlying cause(s) of each patient's acute respiratory failure. To establish baseline diagnostic accuracy, clinicians were shown 2 vignettes without AI model input. Clinicians were then randomized to see 6 vignettes with AI model input with or without AI model explanations. Among these 6 vignettes, 3 vignettes included standard-model predictions, and 3 vignettes included systematically biased model predictions.

MAIN OUTCOMES AND MEASURES: Clinician diagnostic accuracy for pneumonia, heart failure, and chronic obstructive pulmonary disease.

RESULTS: Median participant age was 34 years (IQR, 31-39) and 241 (57.7%) were female. Four hundred fifty-seven clinicians were randomized and completed at least 1 vignette, with 231 randomized to AI model predictions without explanations, and 226 randomized to AI model predictions with explanations. Clinicians' baseline diagnostic accuracy was 73.0% (95% CI, 68.3% to 77.8%) for the 3 diagnoses. When shown a standard AI model without explanations, clinician accuracy increased over baseline by 2.9 percentage points (95% CI, 0.5 to 5.2) and by 4.4 percentage points (95% CI, 2.0 to 6.9) when clinicians were also shown AI model explanations. Systematically biased AI model predictions decreased clinician accuracy by 11.3 percentage points (95% CI, 7.2 to 15.5) compared with baseline and providing biased AI model predictions with explanations decreased clinician accuracy by 9.1 percentage points (95% CI, 4.9 to 13.2) compared with baseline, representing a nonsignificant improvement of 2.3 percentage points (95% CI, -2.7 to 7.2) compared with the systematically biased AI model.

CONCLUSIONS AND RELEVANCE: Although standard AI models improve diagnostic accuracy, systematically biased AI models reduced diagnostic accuracy, and commonly used image-based AI model explanations did not mitigate this harmful effect.

TRIAL REGISTRATION: ClinicalTrials.gov Identifier: NCT06098950.
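The effect estimates in the RESULTS can be read side by side with a short tabulation. The sketch below is only an illustration: the `Effect` class, its field names, and the labels are hypothetical, the numbers are copied from the abstract, and decreases are entered with a negative sign. It assumes the usual reading that a 95% CI containing 0 corresponds to a nonsignificant difference at the two-sided 0.05 level. It does not reproduce the study's statistical analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effect:
    """One reported change in diagnostic accuracy, in percentage points."""
    label: str
    estimate: float
    ci_low: float
    ci_high: float

    def excludes_zero(self) -> bool:
        # True when the whole 95% CI lies on one side of 0.
        return self.ci_low > 0 or self.ci_high < 0


# Values copied from the abstract. Decreases are entered as negative numbers,
# so "decreased by 11.3 (95% CI, 7.2 to 15.5)" becomes -11.3 (-15.5 to -7.2).
EFFECTS = [
    Effect("standard AI, no explanations vs baseline", 2.9, 0.5, 5.2),
    Effect("standard AI, with explanations vs baseline", 4.4, 2.0, 6.9),
    Effect("biased AI, no explanations vs baseline", -11.3, -15.5, -7.2),
    Effect("biased AI, with explanations vs baseline", -9.1, -13.2, -4.9),
    Effect("biased AI: with vs without explanations", 2.3, -2.7, 7.2),
]

for e in EFFECTS:
    verdict = "CI excludes 0" if e.excludes_zero() else "CI includes 0"
    print(f"{e.label:<44} {e.estimate:+6.1f} "
          f"({e.ci_low:+.1f} to {e.ci_high:+.1f})  {verdict}")
```

Only the last row has an interval that spans 0, which matches the abstract's description of the 2.3-point difference as nonsignificant. The simple difference of the two reported decreases (11.3 - 9.1 = 2.2) is slightly smaller than the reported 2.3; this is presumably a rounding effect, but the abstract does not say.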
