Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians.

Author information

Google DeepMind, Mountain View, CA, USA.

Google Research, New York, NY, USA.

Publication information

Nat Med. 2023 Jul;29(7):1814-1820. doi: 10.1038/s41591-023-02437-x. Epub 2023 Jul 17.

DOI: 10.1038/s41591-023-02437-x

PMID: 37460754

Abstract

Predictive artificial intelligence (AI) systems based on deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings, but can make errors in cases accurately diagnosed by clinicians and vice versa. We developed Complementarity-Driven Deferral to Clinical Workflow (CoDoC), a system that can learn to decide between the opinion of a predictive AI model and a clinical workflow. CoDoC enhances accuracy relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer screening, compared to double reading with arbitration in a screening program in the UK, CoDoC reduced false positives by 25% at the same false-negative rate, while achieving a 66% reduction in clinician workload. For TB triaging, compared to standalone AI and clinical workflows, CoDoC achieved a 5-15% reduction in false positives at the same false-negative rate for three of five commercially available predictive AI systems. To facilitate the deployment of CoDoC in novel futuristic clinical settings, we present results showing that CoDoC's performance gains are sustained across several axes of variation (imaging modality, clinical setting and predictive AI system) and discuss the limitations of our evaluation and where further validation would be needed. We provide an open-source implementation to encourage further research and application.
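The abstract does not spell out how the deferral decision is learned. The block below is only a toy sketch of the general idea of deferring between an AI model and a clinician, not the CoDoC implementation. It assumes the AI emits a confidence score in [0, 1], that deferral happens inside a single score band, and that the band is chosen by grid search on labelled tuning data. All names (`Case`, `decide`, `errors`, `fit_band`) and the toy cases are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Case:
    ai_score: float            # assumed AI confidence that disease is present
    clinician_positive: bool   # decision of the clinical workflow
    truth: bool                # ground-truth label


def decide(case: Case, lo: float, hi: float, threshold: float = 0.5) -> bool:
    """Use the clinician's decision inside [lo, hi]; otherwise threshold the AI score."""
    if lo <= case.ai_score <= hi:
        return case.clinician_positive
    return case.ai_score >= threshold


def errors(cases: List[Case], lo: float, hi: float) -> Tuple[int, int]:
    """Return (false positives, false negatives) of the combined system."""
    fp = sum(1 for c in cases if decide(c, lo, hi) and not c.truth)
    fn = sum(1 for c in cases if not decide(c, lo, hi) and c.truth)
    return fp, fn


def fit_band(cases: List[Case], max_fn: int, grid: int = 21) -> Tuple[float, float]:
    """Grid-search the deferral band with the fewest false positives,
    subject to at most max_fn false negatives; ties prefer fewer deferrals."""
    points = [i / (grid - 1) for i in range(grid)]
    best = None
    for lo in points:
        for hi in points:
            if hi < lo:
                continue
            fp, fn = errors(cases, lo, hi)
            if fn > max_fn:
                continue
            deferred = sum(1 for c in cases if lo <= c.ai_score <= hi)
            key = (fp, deferred)
            if best is None or key < best[0]:
                best = (key, (lo, hi))
    if best is None:
        raise ValueError("no band satisfies the false-negative constraint")
    return best[1]


# Invented toy data: the AI errs near its threshold, the clinician errs elsewhere.
toy_cases = [
    Case(0.95, True, True),
    Case(0.03, False, False),
    Case(0.52, False, False),  # AI false positive, clinician correct
    Case(0.48, True, True),    # AI false negative, clinician correct
    Case(0.08, True, False),   # clinician false positive, AI correct
    Case(0.97, False, True),   # clinician false negative, AI correct
]
band = fit_band(toy_cases, max_fn=0)
print(band, errors(toy_cases, *band))
```

On this invented data, the fitted band defers only the two cases near the AI threshold, so the combined system avoids the errors that the AI alone and the clinician alone each make. A real evaluation would fit the band on tuning data and report errors on a separate held-out set.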

Similar articles

AI-based Strategies to Reduce Workload in Breast Cancer Screening with Mammography and Tomosynthesis: A Retrospective Evaluation.

Radiology. 2021 Jul;300(1):57-65. doi: 10.1148/radiol.2021203555. Epub 2021 May 4.

Using deep learning to assist readers during the arbitration process: a lesion-based retrospective evaluation of breast cancer screening performance.

Eur Radiol. 2022 Feb;32(2):842-852. doi: 10.1007/s00330-021-08217-w. Epub 2021 Aug 12.

Breast cancer screening with digital breast tomosynthesis: comparison of different reading strategies implementing artificial intelligence.

Eur Radiol. 2023 May;33(5):3754-3765. doi: 10.1007/s00330-022-09316-y. Epub 2022 Dec 11.

Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection.

EBioMedicine. 2023 Apr;90:104498. doi: 10.1016/j.ebiom.2023.104498. Epub 2023 Feb 28.

Augmenting lung cancer diagnosis on chest radiographs: positioning artificial intelligence to improve radiologist performance.

Clin Radiol. 2021 Aug;76(8):607-614. doi: 10.1016/j.crad.2021.03.021. Epub 2021 May 11.

Effect of artificial intelligence-based triaging of breast cancer screening mammograms on cancer detection and radiologist workload: a retrospective simulation study.

Lancet Digit Health. 2020 Sep;2(9):e468-e474. doi: 10.1016/S2589-7500(20)30185-0.

Deep learning-based triage and analysis of lesion burden for COVID-19: a retrospective study with external validation.

Lancet Digit Health. 2020 Oct;2(10):e506-e515. doi: 10.1016/S2589-7500(20)30199-0. Epub 2020 Sep 22.

Comparison of Chest Radiograph Interpretations by Artificial Intelligence Algorithm vs Radiology Residents.

JAMA Netw Open. 2020 Oct 1;3(10):e2022779. doi: 10.1001/jamanetworkopen.2020.22779.

Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy.

BMJ. 2021 Sep 1;374:n1872. doi: 10.1136/bmj.n1872.

Cited by

A vision of human-AI collaboration for enhanced biological collection curation and research.

Bioscience. 2025 Mar 28;75(6):457-471. doi: 10.1093/biosci/biaf021. eCollection 2025 Jun.

Adoption and perception of LLM-based chatbots in health care: an exploratory cross-sectional survey of individuals with rheumatic diseases.

Rheumatol Adv Pract. 2025 Jul 12;9(3):rkaf083. doi: 10.1093/rap/rkaf083. eCollection 2025.

Beyond Assistance: The Case for Role Separation in AI-Human Radiology Workflows.

Radiology. 2025 Jul;316(1):e250477. doi: 10.1148/radiol.250477.

Artificial intelligence for healthcare: restrained development despite impressive applications.

Infect Dis Poverty. 2025 Jul 20;14(1):72. doi: 10.1186/s40249-025-01339-z.

Comparing the accuracy of computer-aided detection (CAD) software and radiologists from multiple countries for tuberculosis detection in chest X-Rays.

Sci Rep. 2025 Jul 2;15(1):22540. doi: 10.1038/s41598-025-06164-w.

Advancing Mental Health Care: A Comprehensive Review of Digital Tools and Technologies for Enhancing Diagnosis, Treatment, and Wellness.

Health Care Sci. 2025 May 31;4(3):163-178. doi: 10.1002/hcs2.70018. eCollection 2025 Jun.

Foundation models and intelligent decision-making: Progress, challenges, and perspectives.

Innovation (Camb). 2025 May 12;6(6):100948. doi: 10.1016/j.xinn.2025.100948. eCollection 2025 Jun 2.

Simulating workload reduction with an AI-based prostate cancer detection pathway using a prediction uncertainty metric.

Eur Radiol. 2025 Jun 7. doi: 10.1007/s00330-025-11727-6.

Navigating the AI Revolution in Medicine-Adopting Strategies for Medical Education.

Med Sci Educ. 2024 Dec 27;35(2):1055-1061. doi: 10.1007/s40670-024-02257-9. eCollection 2025 Apr.

Hierarchical diagnosis of breast phyllodes tumors enabled by deep learning of ultrasound images: a retrospective multi-center study.

Cancer Imaging. 2025 May 8;25(1):61. doi: 10.1186/s40644-025-00879-9.
