Comparative analysis of supervised and self-supervised learning with small and imbalanced medical imaging datasets.

Authors

Espis Andrea, Marzi Chiara, Diciotti Stefano

Affiliations

Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi" - DEI, University of Bologna, Via dell'Università 50, 47521, Cesena, Italy.

Department of Statistics, Computer Science and Applications "Giuseppe Parenti", University of Florence, 50134, Florence, Italy.

Publication information

Sci Rep. 2025 Sep 2;15(1):32345. doi: 10.1038/s41598-025-99000-0.

DOI:10.1038/s41598-025-99000-0

PMID:40897785

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12405560/

Abstract

Self-supervised learning (SSL) in computer vision has shown its potential to reduce reliance on labeled data. However, most studies focused on balanced, large, broad-domain datasets like ImageNet, whereas, in real-world medical applications, dataset size is typically limited. This study compares the performance of SSL versus supervised learning (SL) on small, imbalanced medical imaging datasets. We experimented with four binary classification tasks: age prediction and diagnosis of Alzheimer's disease from brain magnetic resonance imaging scans, pneumonia from chest radiograms, and retinal diseases associated with choroidal neovascularization from optical coherence tomography with a mean size of training sets of 843 images, 771 images, 1,214 images, and 33,484 images, respectively. We tested various combinations of label availability and class frequency distribution, repeating the training with different random seeds to assess result uncertainty. In most experiments involving small training sets, SL outperformed the selected SSL paradigms, even when a limited portion of labeled data was available. Our findings highlight the importance of carefully selecting learning paradigms based on specific application requirements, which are influenced by factors such as training set size, label availability, and class frequency distribution.
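
The experimental design described in the abstract (varying label availability and class frequency distribution, then repeating with different random seeds) can be illustrated with a small sketch. The abstract does not describe the authors' actual sampling protocol, so everything here is an assumption made for illustration: the function name `sample_imbalanced_subset`, its parameters, the stratified choice of labeled examples, and the toy pool of 1,000 labels are all hypothetical.

```python
import random
from collections import Counter


def sample_imbalanced_subset(labels, n_total, minority_fraction,
                             labeled_fraction, seed):
    """Draw a binary-class subset with a chosen class ratio and label budget.

    Hypothetical helper, not the paper's protocol. Returns
    (indices, labeled_indices): the sampled example indices and the part of
    them whose labels are treated as available for supervised training.
    """
    rng = random.Random(seed)  # one independent generator per seed
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Class 1 plays the role of the minority class in this sketch.
    n_pos = round(n_total * minority_fraction)
    n_neg = n_total - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("not enough examples for the requested class ratio")

    # random.sample returns items in random order, so any prefix of the
    # result is itself a uniform random subset.
    pos_sel = rng.sample(positives, n_pos)
    neg_sel = rng.sample(negatives, n_neg)

    # Assumption: labeled examples keep the same class ratio (stratified).
    n_pos_lab = round(n_pos * labeled_fraction)
    n_neg_lab = round(n_neg * labeled_fraction)
    labeled = pos_sel[:n_pos_lab] + neg_sel[:n_neg_lab]
    return pos_sel + neg_sel, labeled


if __name__ == "__main__":
    pool = [1] * 300 + [0] * 700  # toy label pool, not real data
    for seed in range(3):
        idx, labeled = sample_imbalanced_subset(pool, 200, 0.1, 0.25, seed)
        class_counts = Counter(pool[i] for i in idx)
        print(seed, dict(class_counts), len(labeled))
```

In a setup like this, each (class ratio, label fraction, seed) combination would yield one training run, and the spread of a metric across seeds gives a rough estimate of result uncertainty.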

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5923/12405560/02021227a074/41598_2025_99000_Fig1_HTML.jpg

Similar articles

Comparative analysis of supervised and self-supervised learning with small and imbalanced medical imaging datasets.

Sci Rep. 2025 Sep 2;15(1):32345. doi: 10.1038/s41598-025-99000-0.

Semi-Supervised Learning Allows for Improved Segmentation With Reduced Annotations of Brain Metastases Using Multicenter MRI Data.

J Magn Reson Imaging. 2025 Jun;61(6):2469-2479. doi: 10.1002/jmri.29686. Epub 2025 Jan 10.

Prescription of Controlled Substances: Benefits and Risks

Explainable self-supervised learning for medical image diagnosis based on DINO V2 model and semantic search.

Sci Rep. 2025 Sep 1;15(1):32174. doi: 10.1038/s41598-025-15604-6.

Boundary-aware information maximization for self-supervised medical image segmentation.

Med Image Anal. 2024 May;94:103150. doi: 10.1016/j.media.2024.103150. Epub 2024 Mar 28.

Self-Supervised Learning for Improved Optical Coherence Tomography Detection of Macular Telangiectasia Type 2.

JAMA Ophthalmol. 2024 Mar 1;142(3):226-233. doi: 10.1001/jamaophthalmol.2023.6454.

A medical image classification method based on self-regularized adversarial learning.

Med Phys. 2024 Nov;51(11):8232-8246. doi: 10.1002/mp.17320. Epub 2024 Jul 30.

A segment anything model-guided and match-based semi-supervised segmentation framework for medical imaging.

Med Phys. 2025 Mar 29. doi: 10.1002/mp.17785.

Artificial intelligence for diagnosing exudative age-related macular degeneration.

Cochrane Database Syst Rev. 2024 Oct 17;10(10):CD015522. doi: 10.1002/14651858.CD015522.pub2.

OCT-SelfNet: a self-supervised framework with multi-source datasets for generalized retinal disease detection.

Front Big Data. 2025 Jul 29;8:1609124. doi: 10.3389/fdata.2025.1609124. eCollection 2025.
