• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

评估 GPT-4Vision 在神经退行性疾病组织病理学中少样本学习的效果:与卷积神经网络模型的比较分析。

Evaluating the efficacy of few-shot learning for GPT-4Vision in neurodegenerative disease histopathology: A comparative analysis with convolutional neural network model.

机构信息

Department of Neuroscience, Mayo Clinic, Jacksonville, Florida, USA.

Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania, USA.

出版信息

Neuropathol Appl Neurobiol. 2024 Aug;50(4):e12997. doi: 10.1111/nan.12997.

DOI:10.1111/nan.12997
PMID:39010256
Abstract

AIMS

Recent advances in artificial intelligence, particularly with large language models like GPT-4Vision (GPT-4V)-a derivative feature of ChatGPT-have expanded the potential for medical image interpretation. This study evaluates the accuracy of GPT-4V in image classification tasks of histopathological images and compares its performance with a traditional convolutional neural network (CNN).

METHODS

We utilised 1520 images, including haematoxylin and eosin staining and tau immunohistochemistry, from patients with various neurodegenerative diseases, such as Alzheimer's disease (AD), progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD). We assessed GPT-4V's performance using multi-step prompts to determine how textual context influences image interpretation. We also employed few-shot learning to enhance improvements in GPT-4V's diagnostic performance in classifying three specific tau lesions-astrocytic plaques, neuritic plaques and tufted astrocytes-and compared the outcomes with the CNN model YOLOv8.

RESULTS

GPT-4V accurately recognised staining techniques and tissue origin but struggled with specific lesion identification. The interpretation of images was notably influenced by the provided textual context, which sometimes led to diagnostic inaccuracies. For instance, when presented with images of the motor cortex, the diagnosis shifted inappropriately from AD to CBD or PSP. However, few-shot learning markedly improved GPT-4V's diagnostic capabilities, enhancing accuracy from 40% in zero-shot learning to 90% with 20-shot learning, matching the performance of YOLOv8, which required 100-shot learning to achieve the same accuracy.

CONCLUSIONS

Although GPT-4V faces challenges in independently interpreting histopathological images, few-shot learning significantly improves its performance. This approach is especially promising for neuropathology, where acquiring extensive labelled datasets is often challenging.

摘要

目的

最近人工智能领域的进展,尤其是像 GPT-4Vision(ChatGPT 的一个衍生功能)这样的大型语言模型的出现,拓宽了医学图像解释的潜力。本研究评估了 GPT-4V 在组织病理学图像分类任务中的准确性,并将其性能与传统的卷积神经网络(CNN)进行了比较。

方法

我们使用了 1520 张包括苏木精和伊红染色和 tau 免疫组化的图像,这些图像来自患有各种神经退行性疾病的患者,如阿尔茨海默病(AD)、进行性核上性麻痹(PSP)和皮质基底节变性(CBD)。我们使用多步提示来评估 GPT-4V 的性能,以确定文本上下文如何影响图像解释。我们还采用了少样本学习来提高 GPT-4V 在分类三种特定 tau 病变——星形胶质斑块、神经原纤维斑块和丛状星形胶质细胞——中的诊断性能,并将结果与 CNN 模型 YOLOv8 进行了比较。

结果

GPT-4V 准确识别了染色技术和组织来源,但在特定病变识别方面存在困难。图像的解释受到提供的文本上下文的显著影响,有时导致诊断不准确。例如,当呈现运动皮层的图像时,诊断不当从 AD 转移到 CBD 或 PSP。然而,少样本学习显著提高了 GPT-4V 的诊断能力,将零样本学习的准确率从 40%提高到 20 样本学习的 90%,与需要 100 样本学习才能达到相同准确率的 YOLOv8 相匹配。

结论

尽管 GPT-4V 在独立解释组织病理学图像方面面临挑战,但少样本学习显著提高了其性能。这种方法在神经病理学中特别有前景,因为在神经病理学中获取广泛的标记数据集通常具有挑战性。

相似文献

1
Evaluating the efficacy of few-shot learning for GPT-4Vision in neurodegenerative disease histopathology: A comparative analysis with convolutional neural network model.评估 GPT-4Vision 在神经退行性疾病组织病理学中少样本学习的效果:与卷积神经网络模型的比较分析。
Neuropathol Appl Neurobiol. 2024 Aug;50(4):e12997. doi: 10.1111/nan.12997.
2
Unveiling GPT-4V's hidden challenges behind high accuracy on USMLE questions: Observational Study.揭示GPT-4V在美国医师执照考试(USMLE)问题上高精度背后的隐藏挑战:观察性研究。
J Med Internet Res. 2025 Feb 7;27:e65146. doi: 10.2196/65146.
3
Exploring Generative Pre-Trained Transformer-4-Vision for Nystagmus Classification: Development and Validation of a Pupil-Tracking Process.探索用于眼球震颤分类的生成式预训练变换器-4视觉模型:瞳孔追踪过程的开发与验证
JMIR Form Res. 2025 Jun 6;9:e70070. doi: 10.2196/70070.
4
Prescription of Controlled Substances: Benefits and Risks管制药品的处方:益处与风险
5
Deep Learning for the Early Detection of Invasive Ductal Carcinoma in Histopathological Images: Convolutional Neural Network Approach With Transfer Learning.基于深度学习的组织病理学图像中浸润性导管癌早期检测:采用迁移学习的卷积神经网络方法
JMIR Form Res. 2025 Aug 21;9:e62996. doi: 10.2196/62996.
6
Performance and Reproducibility of Large Language Models in Named Entity Recognition: Considerations for the Use in Controlled Environments.大型语言模型在命名实体识别中的性能与可重复性:在受控环境中使用的考量
Drug Saf. 2025 Mar;48(3):287-303. doi: 10.1007/s40264-024-01499-1. Epub 2024 Dec 11.
7
Assessing the Diagnostic Capabilities of ChatGPT-4 Omni in Grading Diabetic Retinopathy Fundoscopy Using Color Fundus Photographs.评估ChatGPT-4 Omni利用彩色眼底照片对糖尿病视网膜病变眼底镜检查进行分级的诊断能力。
Clin Ophthalmol. 2025 Aug 31;19:3103-3112. doi: 10.2147/OPTH.S517238. eCollection 2025.
8
Development and Validation of a Convolutional Neural Network Model to Predict a Pathologic Fracture in the Proximal Femur Using Abdomen and Pelvis CT Images of Patients With Advanced Cancer.利用晚期癌症患者腹部和骨盆 CT 图像建立卷积神经网络模型预测股骨近端病理性骨折的研究
Clin Orthop Relat Res. 2023 Nov 1;481(11):2247-2256. doi: 10.1097/CORR.0000000000002771. Epub 2023 Aug 23.
9
Short-Term Memory Impairment短期记忆障碍
10
Evaluating Bard Gemini Pro and GPT-4 Vision Against Student Performance in Medical Visual Question Answering: Comparative Case Study.在医学视觉问答中评估Bard Gemini Pro和GPT-4 Vision对学生表现的影响:比较案例研究
JMIR Form Res. 2024 Dec 17;8:e57592. doi: 10.2196/57592.

引用本文的文献

1
Standardised TruAI Automated Quantification of Intracellular Neuromelanin Granules in Human Brain Tissue Sections.人脑组织切片中细胞内神经黑色素颗粒的标准化TruAI自动定量分析
Neuropathol Appl Neurobiol. 2025 Aug;51(4):e70033. doi: 10.1111/nan.70033.
2
Large language models for disease diagnosis: a scoping review.用于疾病诊断的大语言模型:一项范围综述。
NPJ Artif Intell. 2025;1(1):9. doi: 10.1038/s44387-025-00011-z. Epub 2025 Jun 9.
3
Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models.
检索增强生成与基于文档的生成:大语言模型中的一个关键区别。
J Pathol Clin Res. 2025 Jan;11(1):e70014. doi: 10.1002/2056-4538.70014.
4
Large language models as a diagnostic support tool in neuropathology.大语言模型在神经病理学中的诊断支持工具。
J Pathol Clin Res. 2024 Nov;10(6):e70009. doi: 10.1002/2056-4538.70009.
5
Advancing large language models in nephrology: bridging the gap in image interpretation.推进肾脏病学领域的大语言模型:弥合图像解读方面的差距。
Clin Exp Nephrol. 2025 Jan;29(1):128-129. doi: 10.1007/s10157-024-02581-9. Epub 2024 Oct 28.
6
The Potential of Chat-Based Artificial Intelligence Models in Differentiating Between Keloid and Hypertrophic Scars: A Pilot Study.基于聊天的人工智能模型在区分瘢痕疙瘩和增生性瘢痕方面的潜力:一项初步研究。
Aesthetic Plast Surg. 2024 Dec;48(24):5367-5372. doi: 10.1007/s00266-024-04380-9. Epub 2024 Sep 25.