Semi-supervised Machine Learning with MixMatch and Equivalence Classes.

Authors

Hansen Colin B, Nath Vishwesh, Gao Riqiang, Bermudez Camilo, Huo Yuankai, Sandler Kim L, Massion Pierre P, Blume Jeffrey D, Lasko Thomas A, Landman Bennett A

Affiliations

Computer Science, Vanderbilt University, Nashville, TN 37235, USA.

Vanderbilt University Medical Center, Nashville, TN 37235, USA.

Publication information

Lect Notes Monogr Ser. 2020;12446:112-121. Epub 2020 Oct 2.

Abstract

Semi-supervised methods have an increasing impact on computer vision tasks because they make use of scarce labels on large datasets, yet these approaches have not been well translated to medical imaging. Of particular interest, the MixMatch method achieves significant performance improvement over popular semi-supervised learning methods with scarce labels on the CIFAR-10 dataset. In a complementary approach, Nullspace Tuning on equivalence classes offers the potential to leverage multiple scans of a subject when the ground truth for that subject is unknown. This work is the first to (1) explore MixMatch with Nullspace Tuning in the context of medical imaging and (2) characterize the impact of these methods as labels diminish. We consider two distinct medical imaging domains: skin lesion diagnosis and lung cancer prediction. In both cases we evaluate models trained with diminishing labeled data using supervised learning, MixMatch, Nullspace Tuning, and MixMatch combined with Nullspace Tuning. The combined method achieves an AUC of 0.755 in lung cancer diagnosis with only 200 labeled subjects on the National Lung Screening Trial and a balanced multi-class accuracy of 77% with only 779 labeled examples on HAM10000. This performance is similar to that of the fully supervised methods when all labels are available. In advancing data-driven methods in medical imaging, it is important to consider current state-of-the-art semi-supervised learning methods from the greater machine learning community and their impact on the limitations of data acquisition and annotation.
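
The HAM10000 result above is reported as balanced multi-class accuracy. As a minimal sketch of that metric, assuming the common definition (the unweighted mean of per-class recall), the following Python computes it on toy labels. The function name `balanced_accuracy` and the labels are illustrative assumptions; the abstract does not state how the authors computed the figure.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true.

    Assumed definition for illustration; not taken from the paper itself.
    """
    correct = defaultdict(int)  # correctly predicted examples per true class
    total = defaultdict(int)    # number of examples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def plain_accuracy(y_true, y_pred):
    """Fraction of examples predicted correctly, ignoring class balance."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Toy, made-up labels with an imbalanced class distribution (4 / 2 / 1).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2]

print("balanced:", balanced_accuracy(y_true, y_pred))
print("plain:   ", plain_accuracy(y_true, y_pred))
```

Because each class contributes equally regardless of its size, this metric is less flattering than plain accuracy when a model favors the majority class, which matters for imbalanced datasets.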
