• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

评估ChatGPT-4在MASH肝纤维化组织病理学评估中的诊断准确性。

Assessing the diagnostic accuracy of ChatGPT-4 in the histopathological evaluation of liver fibrosis in MASH.

作者信息

Panzeri Davide, Laohawetwanit Thiyaphat, Akpinar Reha, De Carlo Camilla, Belsito Vincenzo, Terracciano Luigi, Aghemo Alessio, Pugliese Nicola, Chirico Giuseppe, Inverso Donato, Calderaro Julien, Sironi Laura, Di Tommaso Luca

机构信息

Department of Physics, University of Milano-Bicocca, Milan, Italy.

Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand.

出版信息

Hepatol Commun. 2025 Apr 30;9(5). doi: 10.1097/HC9.0000000000000695. eCollection 2025 May 1.

DOI:10.1097/HC9.0000000000000695
PMID:40304570
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC12045550/
Abstract

BACKGROUND

Large language models like ChatGPT have demonstrated potential in medical image interpretation, but their efficacy in liver histopathological analysis remains largely unexplored. This study aims to assess ChatGPT-4-vision's diagnostic accuracy, compared to liver pathologists' performance, in evaluating liver fibrosis (stage) in metabolic dysfunction-associated steatohepatitis.

METHODS

Digitized Sirius Red-stained images for 59 metabolic dysfunction-associated steatohepatitis tissue biopsy specimens were evaluated by ChatGPT-4 and 4 pathologists using the NASH-CRN staging system. Fields of view at increasing magnification levels, extracted by a senior pathologist or randomly selected, were shown to ChatGPT-4, asking for fibrosis staging. The diagnostic accuracy of ChatGPT-4 was compared with pathologists' evaluations and correlated to the collagen proportionate area for additional insights. All cases were further analyzed by an in-context learning approach, where the model learns from exemplary images provided during prompting.

RESULTS

ChatGPT-4's diagnostic accuracy was 81% when using images selected by a pathologist, while it decreased to 54% with randomly cropped fields of view. By employing an in-context learning approach, the accuracy increased to 88% and 77% for selected and random fields of view, respectively. This method enabled the model to fully and correctly identify the tissue structures characteristic of F4 stages, previously misclassified. The study also highlighted a moderate to strong correlation between ChatGPT-4's fibrosis staging and collagen proportionate area.

CONCLUSIONS

ChatGPT-4 showed remarkable results with a diagnostic accuracy overlapping those of expert liver pathologists. The in-context learning analysis, applied here for the first time to assess fibrosis deposition in metabolic dysfunction-associated steatohepatitis samples, was crucial in accurately identifying the key features of F4 cases, critical for early therapeutic decision-making. These findings suggest the potential for integrating large language models as supportive tools in diagnostic pathology.

摘要

背景

像ChatGPT这样的大语言模型已在医学图像解读中展现出潜力,但其在肝脏组织病理学分析中的功效仍很大程度上未被探索。本研究旨在评估ChatGPT-4-vision在评估代谢功能障碍相关脂肪性肝炎中的肝纤维化(阶段)时,与肝脏病理学家的表现相比,其诊断准确性。

方法

59份代谢功能障碍相关脂肪性肝炎组织活检标本的数字化天狼星红染色图像由ChatGPT-4和4名病理学家使用NASH-CRN分期系统进行评估。由一名资深病理学家提取或随机选择的不同放大倍数水平的视野展示给ChatGPT-4,要求其进行纤维化分期。将ChatGPT-4的诊断准确性与病理学家的评估进行比较,并与胶原比例面积相关联以获得更多见解。所有病例均通过上下文学习方法进一步分析,即模型从提示过程中提供的示例图像中学习。

结果

当使用病理学家选择的图像时,ChatGPT-4的诊断准确性为81%,而随机裁剪的视野使其降至54%。通过采用上下文学习方法,对于选择的和随机的视野,准确性分别提高到88%和77%。这种方法使模型能够完全且正确地识别先前被错误分类的F4阶段的组织结构特征。该研究还强调了ChatGPT-4的纤维化分期与胶原比例面积之间存在中度至强相关性。

结论

ChatGPT-4显示出显著结果,其诊断准确性与肝脏病理专家相当。本文首次应用的上下文学习分析对于准确识别F4病例的关键特征至关重要,这对早期治疗决策至关重要,这些发现表明将大语言模型整合为诊断病理学辅助工具的潜力。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/1912aa83022e/hc9-9-e0695-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/443b9cdb5c37/hc9-9-e0695-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/1e2fbf91ec17/hc9-9-e0695-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/d08a78b16657/hc9-9-e0695-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/987a17c56700/hc9-9-e0695-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/3cfe693b797c/hc9-9-e0695-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/1912aa83022e/hc9-9-e0695-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/443b9cdb5c37/hc9-9-e0695-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/1e2fbf91ec17/hc9-9-e0695-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/d08a78b16657/hc9-9-e0695-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/987a17c56700/hc9-9-e0695-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/3cfe693b797c/hc9-9-e0695-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e85/12045550/1912aa83022e/hc9-9-e0695-g006.jpg

相似文献

1
Assessing the diagnostic accuracy of ChatGPT-4 in the histopathological evaluation of liver fibrosis in MASH.评估ChatGPT-4在MASH肝纤维化组织病理学评估中的诊断准确性。
Hepatol Commun. 2025 Apr 30;9(5). doi: 10.1097/HC9.0000000000000695. eCollection 2025 May 1.
2
Utility of AI digital pathology as an aid for pathologists scoring fibrosis in MASH.人工智能数字病理学在协助病理学家对MASH中的纤维化进行评分方面的效用。
J Hepatol. 2025 May;82(5):898-908. doi: 10.1016/j.jhep.2024.11.032. Epub 2024 Nov 28.
3
High-Throughput, Machine Learning-Based Quantification of Steatosis, Inflammation, Ballooning, and Fibrosis in Biopsies From Patients With Nonalcoholic Fatty Liver Disease.高通量、基于机器学习的非酒精性脂肪性肝病患者肝活检组织中脂肪变性、炎症、气球样变、纤维化的定量分析。
Clin Gastroenterol Hepatol. 2020 Aug;18(9):2081-2090.e9. doi: 10.1016/j.cgh.2019.12.025. Epub 2019 Dec 27.
4
Evaluating ChatGPT-4's Diagnostic Accuracy: Impact of Visual Data Integration.评估ChatGPT-4的诊断准确性:视觉数据整合的影响。
JMIR Med Inform. 2024 Apr 9;12:e55627. doi: 10.2196/55627.
5
Diagnostic accuracy of FibroScan-AST (FAST) score, non-alcoholic fatty liver fibrosis score (NFS), FibroScan, and liver fibrosis index (FIB-4) for identifying fibrotic non-alcoholic steatohepatitis in patients with chronic hepatitis B with metabolic dysfunction-associated fatty liver disease.诊断准确性的 FibroScan-AST (FAST) 评分、非酒精性脂肪性肝纤维化评分 (NFS)、FibroScan 和肝纤维化指数 (FIB-4) 在识别代谢相关脂肪性肝病合并慢性乙型肝炎患者的纤维化非酒精性脂肪性肝炎。
Ann Med. 2024 Dec;56(1):2420858. doi: 10.1080/07853890.2024.2420858. Epub 2024 Oct 26.
6
Automated quantification and architectural pattern detection of hepatic fibrosis in NAFLD.非酒精性脂肪性肝病肝纤维化的自动量化和结构模式检测。
Ann Diagn Pathol. 2020 Aug;47:151518. doi: 10.1016/j.anndiagpath.2020.151518. Epub 2020 Apr 12.
7
Liver biopsy-based validation, confirmation and comparison of the diagnostic performance of established and novel non-invasive steatotic liver disease indexes: Results from a large multi-center study.基于肝活检的验证、确认和比较新型及传统非酒精性脂肪性肝病无创性诊断指标的诊断效能:一项多中心大样本研究。
Metabolism. 2023 Oct;147:155666. doi: 10.1016/j.metabol.2023.155666. Epub 2023 Jul 30.
8
qFIBS: An Automated Technique for Quantitative Evaluation of Fibrosis, Inflammation, Ballooning, and Steatosis in Patients With Nonalcoholic Steatohepatitis.qFIBS:一种用于非酒精性脂肪性肝炎患者纤维化、炎症、气球样变和脂肪变性定量评估的自动化技术。
Hepatology. 2020 Jun;71(6):1953-1966. doi: 10.1002/hep.30986. Epub 2020 May 7.
9
Development of a diagnostic support system for the fibrosis of nonalcoholic fatty liver disease using artificial intelligence and deep learning.利用人工智能和深度学习开发非酒精性脂肪性肝病纤维化诊断支持系统
Kaohsiung J Med Sci. 2024 Aug;40(8):757-765. doi: 10.1002/kjm2.12850. Epub 2024 May 31.
10
Thinking like a pathologist: Morphologic approach to hepatobiliary tumors by ChatGPT.像病理学家一样思考:ChatGPT对肝胆肿瘤的形态学研究方法
Am J Clin Pathol. 2025 Jan 28;163(1):3-11. doi: 10.1093/ajcp/aqae087.

引用本文的文献

1
Reply: Diagnostic accuracy of ChatGPT-4 and liver fibrosis in MASH.回复:ChatGPT-4与MASH中肝纤维化的诊断准确性。
Hepatol Commun. 2025 Jul 21;9(8). doi: 10.1097/HC9.0000000000000764. eCollection 2025 Aug 1.
2
Letter to the Editor: "Diagnostic accuracy of ChatGPT-4 and liver fibrosis in MASH".致编辑的信:“ChatGPT-4与MASH中肝纤维化的诊断准确性”
Hepatol Commun. 2025 Jul 21;9(8). doi: 10.1097/HC9.0000000000000763. eCollection 2025 Aug 1.
3
Emerging applications of natural language processing for the identification of steatotic liver disease.

本文引用的文献

1
In-context learning enables multimodal large language models to classify cancer pathology images.语境学习使多模态大型语言模型能够对癌症病理学图像进行分类。
Nat Commun. 2024 Nov 21;15(1):10104. doi: 10.1038/s41467-024-51465-9.
2
Role of artificial intelligence in staging and assessing of treatment response in MASH patients.人工智能在MASH患者分期及治疗反应评估中的作用。
Front Med (Lausanne). 2024 Oct 21;11:1480866. doi: 10.3389/fmed.2024.1480866. eCollection 2024.
3
Evaluation and mitigation of the limitations of large language models in clinical decision-making.
自然语言处理在脂肪性肝病识别中的新兴应用
Hepatol Commun. 2025 Jul 14;9(8). doi: 10.1097/HC9.0000000000000774. eCollection 2025 Aug 1.
评估和缓解大型语言模型在临床决策中的局限性。
Nat Med. 2024 Sep;30(9):2613-2622. doi: 10.1038/s41591-024-03097-1. Epub 2024 Jul 4.
4
A multimodal generative AI copilot for human pathology.用于人体病理学的多模态生成式人工智能副驾。
Nature. 2024 Oct;634(8033):466-473. doi: 10.1038/s41586-024-07618-3. Epub 2024 Jun 12.
5
Comparative analysis of ChatGPT and Bard in answering pathology examination questions requiring image interpretation.比较分析 ChatGPT 和 Bard 在回答需要图像解读的病理学检查问题方面的表现。
Am J Clin Pathol. 2024 Sep 3;162(3):252-260. doi: 10.1093/ajcp/aqae036.
6
Fibrosis severity scoring on Sirius red histology with multiple-instance deep learning.基于多实例深度学习的天狼星红组织学纤维化严重程度评分
Biol Imaging. 2023 Jul 18;3:e17. doi: 10.1017/S2633903X23000144. eCollection 2023.
7
A visual-language foundation model for computational pathology.用于计算病理学的视觉-语言基础模型。
Nat Med. 2024 Mar;30(3):863-874. doi: 10.1038/s41591-024-02856-4. Epub 2024 Mar 19.
8
Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology - a recent scoping review.使用大型语言模型(如 ChatGPT)进行诊断医学的挑战和障碍,重点是数字病理学——近期的范围综述。
Diagn Pathol. 2024 Feb 27;19(1):43. doi: 10.1186/s13000-024-01464-7.
9
Adapted large language models can outperform medical experts in clinical text summarization.经过改编的大型语言模型在临床文本总结方面的表现优于医学专家。
Nat Med. 2024 Apr;30(4):1134-1142. doi: 10.1038/s41591-024-02855-5. Epub 2024 Feb 27.
10
Artificial intelligence compared with human-derived patient educational materials on cirrhosis.人工智能与人类生成的肝硬化患者教育材料的比较。
Hepatol Commun. 2024 Feb 14;8(3). doi: 10.1097/HC9.0000000000000367. eCollection 2024 Mar 1.