
Evaluation of Effectiveness of Self-Supervised Learning in Chest X-Ray Imaging to Reduce Annotated Images.

Affiliation

Faculty of Information Technology, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya-ku, Tokyo, 158-8557, Japan.

Publication Information

J Imaging Inform Med. 2024 Aug;37(4):1618-1624. doi: 10.1007/s10278-024-00975-5. Epub 2024 Mar 8.

Abstract

A significant challenge in machine learning-based medical image analysis is the scarcity of medical images. Obtaining a large number of labeled medical images is difficult because annotating them is a time-consuming process that requires specialized knowledge. In addition, inappropriate annotation processes can increase model bias. Self-supervised learning (SSL) is an unsupervised learning approach that extracts image representations without labels, and can therefore be an effective way to reduce the number of labeled images required. In this study, we investigated the feasibility of reducing the number of labeled images given a limited set of unlabeled medical images. Unlabeled chest X-ray (CXR) images were pretrained using the SimCLR framework, and the learned representations were then fine-tuned with supervised learning on the target task. A total of 2000 task-specific CXR images were used for binary classification of coronavirus disease 2019 (COVID-19) versus normal cases. The results demonstrate that, with pretraining on task-specific unlabeled CXR images, performance can be maintained even when the number of labeled CXR images is reduced by approximately 40%. In addition, this performance was significantly better than that obtained without pretraining. In contrast, when only a small number of labeled CXR images is available, a large number of unlabeled pretraining images is required to maintain performance, regardless of their task specificity. In summary, to reduce the number of labeled images using SimCLR, both the number of unlabeled images and the task specificity of the target images must be considered.
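The pretraining stage described above relies on SimCLR's contrastive objective, which pulls together the embeddings of two augmented views of the same image and pushes apart embeddings of different images. As a rough illustration only (not the authors' code), the sketch below implements the NT-Xent loss at the core of SimCLR in plain NumPy; batch layout, temperature value, and variable names are assumptions for the example.

```python
import numpy as np

def nt_xent_loss(z, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss used by SimCLR.

    z: array of shape (2N, d) holding embeddings of two augmented views per
       image; rows i and i + N are assumed to form a positive pair.
    """
    # L2-normalize so the dot product equals cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature          # pairwise similarity matrix
    np.fill_diagonal(sim, -np.inf)         # exclude self-similarity terms

    n2 = z.shape[0]
    n = n2 // 2
    # Index of each row's positive partner (the other augmented view).
    pos_idx = np.concatenate([np.arange(n, n2), np.arange(0, n)])

    # Cross-entropy of the positive against all non-self pairs.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(n2), pos_idx] - logsumexp)
    return loss.mean()

# Toy check: identical positive pairs with orthogonal negatives give a
# near-zero loss at a low temperature.
e1 = np.array([1.0, 0.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0, 0.0])
z_views = np.stack([e1, e2, e1, e2])       # views of image 0 are rows 0 and 2
print(nt_xent_loss(z_views, temperature=0.1))
```

In the study's setup, a backbone pretrained with this objective on unlabeled CXR images would then be fine-tuned with ordinary supervised cross-entropy on the 2000 labeled COVID-19/normal images.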


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e630/11300406/0d5011d02480/10278_2024_975_Fig1_HTML.jpg
