
Distilling knowledge from multiple foundation models for zero-shot image classification.

Affiliation

School of Computer Science and Technology, Shandong University of Science and Technology, Qingdao, Shandong, China.

Publication Information

PLoS One. 2024 Sep 20;19(9):e0310730. doi: 10.1371/journal.pone.0310730. eCollection 2024.

Abstract

Zero-shot image classification enables the recognition of new categories without requiring additional training data, thereby enhancing the model's generalization capability when specific training data are unavailable. This paper introduces a zero-shot image classification framework that recognizes new categories unseen during training by distilling knowledge from foundation models. Specifically, we first employ ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts. Then, the test image is aligned with the text and reference images using CLIP and DINO to compute logits. Finally, the predicted logits are aggregated according to their confidence to produce the final prediction. Experiments are conducted on multiple datasets, including MNIST, SVHN, CIFAR-10, CIFAR-100, and TinyImageNet. The results demonstrate that our method significantly improves classification accuracy compared to previous approaches, achieving AUROC scores of over 96% across all test datasets. Our code is available at https://github.com/1134112149/MICW-ZIC.
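The final step of the pipeline — fusing per-model logits weighted by confidence — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `aggregate_logits` and `softmax` are hypothetical, and using the maximum softmax probability as the confidence weight is an assumption; the authors' exact weighting scheme may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def aggregate_logits(logit_sets):
    """Confidence-weighted fusion of per-model class logits.

    Each element of `logit_sets` is one model's logits over the same
    candidate classes. Confidence is taken as the max softmax
    probability (an assumption for this sketch), so a nearly uniform
    prediction contributes less to the fused result.
    """
    probs = [softmax(l) for l in logit_sets]
    confs = np.array([p.max() for p in probs])
    weights = confs / confs.sum()
    fused = sum(w * p for w, p in zip(weights, probs))
    return int(fused.argmax()), fused

# Toy example: one confident, peaked prediction (e.g. CLIP text
# alignment) and one near-uniform, low-confidence prediction
# (e.g. DINO reference-image similarity).
clip_logits = np.array([2.0, 0.5, 0.1])
dino_logits = np.array([1.0, 1.1, 0.9])
pred, fused = aggregate_logits([clip_logits, dino_logits])
print(pred)  # the confident model dominates -> class 0
```

The fused distribution remains a valid probability vector, and the confident model's vote carries more weight than the near-uniform one.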


Figure 1 (pone.0310730.g001): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1744/11414985/5cce6feb7da7/pone.0310730.g001.jpg
