Harnessing GPT-4 for automated error detection in pathology reports: Implications for oncology diagnostics.

Authors

Yang Xiongwen, Zhang Yun, Jiang Jinyan, Chen Zhijun, Bai Rinasu, Yuan Zihao, Dong Longyan, Xiao Yi, Liu Di, Deng Huiyin, Huang Jian, Shi Huiyou, Liu Dan, Liang Maoli, Tang WeiJuan, Xu Chuan

Affiliations

Department of Thoracic Surgery, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China.

NHC Key Laboratory of Pulmonary Immunological Diseases, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China.

Publication information

Digit Health. 2025 May 29;11:20552076251346703. doi: 10.1177/20552076251346703. eCollection 2025 Jan-Dec.

DOI: 10.1177/20552076251346703
PMID: 40453047
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12123116/

Abstract

OBJECTIVE

Accurate pathology reports are crucial for the diagnosis and treatment planning of cancer patients. However, these reports are prone to errors due to time pressures, subjective interpretation, and inconsistencies among professionals. Addressing these errors is vital for improving oncology care outcomes. Artificial intelligence (AI) systems, such as GPT-4, offer the potential to enhance diagnostic accuracy and efficiency.

METHODS

A total of 700 malignant tumor pathology reports were collected from four hospitals. Of these, 350 reports had deliberate errors introduced by a senior pathologist, mimicking real-world reporting challenges. Error detection performance was evaluated by comparing GPT-4 to six human pathologists (two seniors, two attending pathologists, and two residents). Key metrics included error detection rates with Wilson confidence intervals and processing time per report.

RESULTS

GPT-4 detected 88% of errors (350/400; 95% CI: [84, 91]), compared to a 95% detection rate by the top senior pathologist (382/400; 95% CI: [93, 97]). GPT-4 significantly reduced the average processing time to 4.03 seconds per report, compared to 65.64 seconds for the fastest human pathologist. However, GPT-4 exhibited a higher rate of false positives (2.3%; 95% CI: [1.52, 3.01]) compared to the best-performing senior pathologist (0.3%; 95% CI: [0.01, 0.91]).
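
The detection rates above are reported with Wilson confidence intervals. As a rough illustration of how such an interval can be computed from the counts given in the abstract, here is a minimal Python sketch. It assumes the standard Wilson score formula without continuity correction and a critical value of z = 1.96 for a 95% interval; the abstract does not state which variant or software the authors used, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: no continuity correction, z = 1.96 for ~95% coverage.
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    # Center of the interval is pulled slightly toward 0.5.
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Counts taken from the abstract: errors detected out of 400 introduced errors.
    cases = [("GPT-4", 350, 400), ("Top senior pathologist", 382, 400)]
    for label, detected, total in cases:
        low, high = wilson_interval(detected, total)
        print(f"{label}: {detected / total:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Bounds computed this way may differ from the published ones by about a percentage point, depending on rounding and on the exact Wilson variant the authors applied.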

CONCLUSIONS

GPT-4 demonstrates substantial potential in improving the efficiency and accuracy of pathology error detection, which could accelerate clinical workflows and enhance cancer diagnostics. However, its higher false-positive rate emphasizes the need for human oversight to ensure safe implementation in clinical practice.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a755/12123116/84858cb2e0cc/10.1177_20552076251346703-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a755/12123116/69f8f4624cb3/10.1177_20552076251346703-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a755/12123116/759bae0cc4da/10.1177_20552076251346703-fig2.jpg
