Evaluating the pathological and clinical implications of errors made by an artificial intelligence colon biopsy screening tool.

Authors

Evans Harriet, Sivakumar Naveen, Bhanderi Shivam, Graham Simon, Snead David, Patel Abhilasha, Robinson Andrew

Affiliations

University of Warwick, Coventry, UK

Histopathology, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK.

Publication information

BMJ Open Gastroenterol. 2025 Jan 6;12(1):e001649. doi: 10.1136/bmjgast-2024-001649.

DOI:10.1136/bmjgast-2024-001649

PMID:39762071

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11749196/

Abstract

OBJECTIVE

Artificial intelligence (AI) tools for histological diagnosis offer great potential to healthcare, yet failure to understand their clinical context is delaying adoption. IGUANA (Interpretable Gland-Graphs using a Neural Aggregator) is an AI algorithm that can effectively classify colonic biopsies into normal versus abnormal categories, designed to automatically report normal cases. We performed a retrospective pathological and clinical review of the errors made by IGUANA.

METHODS

False negative (FN) errors were the primary focus due to the greatest propensity for harm. Pathological evaluation involved assessment of whole slide image (WSI) quality, precise diagnoses for each missed entity and identification of factors impeding diagnosis. Clinical evaluation scored the impact of each error on the patient and detailed the type of impact in terms of missed diagnosis, investigations or treatment.

RESULTS

Across 5054 WSIs from 2080 UK National Health Service patients, there were 220 FN errors across 164 cases (4.4% of WSIs, 7.9% of cases). Diagnostic errors ranged from adenocarcinoma to mild inflammation. Of the FN errors, 88.4% would have no impact on patient care, with only one error causing major patient harm. Factors that protected against harm included biopsies being low-risk polyps and diagnostic features being detected in other biopsies.
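
The two percentages quoted above can be reproduced from the reported counts. The following is a minimal sketch, not the authors' analysis code: the helper name `rate_percent` is hypothetical, and it assumes the case-level denominator is the 2080 patients, which is the choice that reproduces the reported 7.9%.

```python
def rate_percent(errors: int, total: int) -> float:
    """Return errors/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * errors / total, 1)


# Counts taken from the RESULTS paragraph.
fn_slides, total_slides = 220, 5054   # FN errors and all WSIs
fn_cases, total_cases = 164, 2080     # cases with an FN error and all patients (assumed denominator)

wsi_rate = rate_percent(fn_slides, total_slides)
case_rate = rate_percent(fn_cases, total_cases)

print(f"WSI-level FN error rate: {wsi_rate}%")
print(f"Case-level FN error rate: {case_rate}%")
```

Under these assumptions both denominators cover the whole cohort, normal and abnormal alike, so the figures describe the share of all slides and cases affected by an FN error rather than a miss rate among abnormal biopsies only.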

CONCLUSION

Most FN errors would not result in patient harm, suggesting that even with a 7.9% case-level error rate, this AI tool might be more suitable for adoption than statistics portray. Consideration of the clinical context of AI tool errors is essential to facilitate safe implementation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51eb/11749196/991ad623da30/bmjgast-12-1-g001.jpg

Similar articles

Evaluating the pathological and clinical implications of errors made by an artificial intelligence colon biopsy screening tool.

BMJ Open Gastroenterol. 2025 Jan 6;12(1):e001649. doi: 10.1136/bmjgast-2024-001649.

Development and validation of artificial intelligence-based prescreening of large-bowel biopsies taken in the UK and Portugal: a retrospective cohort study.

Lancet Digit Health. 2023 Nov;5(11):e786-e797. doi: 10.1016/S2589-7500(23)00148-6.

Frequency and characteristics of errors by artificial intelligence (AI) in reading screening mammography: a systematic review.

Breast Cancer Res Treat. 2024 Aug;207(1):1-13. doi: 10.1007/s10549-024-07353-3. Epub 2024 Jun 9.

Screening of normal endoscopic large bowel biopsies with interpretable graph learning: a retrospective study.

Gut. 2023 Sep;72(9):1709-1721. doi: 10.1136/gutjnl-2023-329512. Epub 2023 May 12.

Artificial intelligence-assisted double reading of chest radiographs to detect clinically relevant missed findings: a two-centre evaluation.

Eur Radiol. 2024 Sep;34(9):5876-5885. doi: 10.1007/s00330-024-10676-w. Epub 2024 Mar 11.

Colon capsule endoscopy versus CT colonography after incomplete colonoscopy. Application of artificial intelligence algorithms to identify complete colonic investigations.

United European Gastroenterol J. 2020 Aug;8(7):782-789. doi: 10.1177/2050640620937593.

Impact of Artificial Intelligence on Miss Rate of Colorectal Neoplasia.

Gastroenterology. 2022 Jul;163(1):295-304.e5. doi: 10.1053/j.gastro.2022.03.007. Epub 2022 Mar 15.

Why do errors arise in artificial intelligence diagnostic tools in histopathology and how can we minimize them?

Histopathology. 2024 Jan;84(2):279-287. doi: 10.1111/his.15071. Epub 2023 Nov 3.

Computer-aided automated diminutive colonic polyp detection in colonoscopy by using deep machine learning system; first indigenous algorithm developed in India.

Indian J Gastroenterol. 2023 Apr;42(2):226-232. doi: 10.1007/s12664-022-01331-7. Epub 2023 May 5.

Artificial Intelligence-Based Tool for Tumor Detection and Quantitative Tissue Analysis in Colorectal Specimens.

Mod Pathol. 2023 Dec;36(12):100327. doi: 10.1016/j.modpat.2023.100327. Epub 2023 Sep 6.
