Pulido J Vince, Guleria Shan, Ehsan Lubaina, Fasullo Matthew, Lippman Robert, Mutha Pritesh, Shah Tilak, Syed Sana, Brown Donald E
Applied Physics Laboratory, Johns Hopkins University, Laurel, MD.
Dept. of Internal Medicine, Rush University Medical Center, Chicago, IL.
Proc IEEE Int Symp Bioinformatics Bioeng. 2020 Oct;2020:563-568. doi: 10.1109/BIBE50027.2020.00097. Epub 2020 Dec 16.
One of the greatest obstacles to the adoption of deep neural networks for new medical applications is that training these models typically requires a large number of manually labeled samples. In this work, we investigate the semi-supervised scenario in which one has access to large amounts of unlabeled data and only a few labeled samples. We study the performance of MixMatch and FixMatch, two popular semi-supervised learning methods, on a histology dataset. More specifically, we study how these methods perform in a highly noisy and imbalanced setting. The findings motivate the development of semi-supervised methods that ameliorate problems commonly encountered in medical data applications.