Wang Xi, Tang Fangyao, Chen Hao, Cheung Carol Y, Heng Pheng-Ann
Zhejiang Lab, Hangzhou, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China; Department of Radiation Oncology, Stanford University School of Medicine, Palo Alto, CA, USA.
Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China.
Med Image Anal. 2023 Jan;83:102673. doi: 10.1016/j.media.2022.102673. Epub 2022 Oct 26.
Supervised deep learning has achieved notable success in recognizing diabetic macular edema (DME) from volumetric optical coherence tomography (OCT) images. A recurring problem in this field is the shortage of labeled data: fine-grained annotation is expensive, which makes accurate analysis by supervised learning substantially harder. The morphological changes that DME causes in the retina may be sparsely distributed across the B-scan images of an OCT volume, and OCT data are often labeled only coarsely, at the volume level. DME identification can therefore be formulated as a multiple instance classification problem and addressed with multiple instance learning (MIL) techniques. However, no previous study has also used unlabeled data to improve classification accuracy, even though doing so is particularly important for high-quality analysis at minimal annotation cost. To this end, we present a novel deep semi-supervised multiple instance learning framework that explores the feasibility of combining a small amount of coarsely labeled data with a large amount of unlabeled data. Specifically, we design several modules that improve performance according to the availability and granularity of the labels. To warm up training, we propagate each bag label to its instances as supervision and propose a self-correction strategy to handle the label noise in positive bags. This strategy is based on confidence-based pseudo-labeling with consistency regularization: the model generates a pseudo-label for a weakly augmented input only when it is highly confident in its prediction, and that pseudo-label then supervises a strongly augmented version of the same input. This learning scheme also applies to unlabeled data.
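The warm-up and self-correction steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-class logits, and the confidence threshold of 0.95 are all assumptions chosen for illustration, and a real system would compute the logits with a deep network on weakly and strongly augmented B-scans.

```python
import math


def propagate_bag_labels(bag_label, num_instances):
    """Warm-up step: copy the volume-level (bag) label to every B-scan (instance)."""
    return [bag_label] * num_instances


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(weak_logits, threshold=0.95):
    """Return the predicted class for the weakly augmented input,
    or None when the model is not confident enough.
    The threshold value is an assumption, not taken from the paper."""
    probs = softmax(weak_logits)
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)


def consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy of the strongly augmented prediction against the
    pseudo-label from the weakly augmented one; zero if no pseudo-label."""
    label = pseudo_label(weak_logits, threshold)
    if label is None:
        return 0.0
    strong_probs = softmax(strong_logits)
    return -math.log(strong_probs[label])


if __name__ == "__main__":
    # A positive volume with four B-scans: every instance starts as positive.
    print(propagate_bag_labels(1, 4))
    # Confident weak prediction for class 1 supervises the strong view.
    print(consistency_loss([0.0, 5.0], [0.0, 0.0]))
    # Unconfident weak prediction contributes no loss.
    print(consistency_loss([0.0, 0.0], [1.0, 2.0]))
```

Because the loss needs only the model's own predictions, the same function can be applied to instances from unlabeled volumes, which matches the statement that the scheme also applies to unlabeled data.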
To enhance the model's discriminative capability, we introduce a Student-Teacher architecture and impose consistency constraints between the two models. The proposed approach was evaluated on two large-scale DME OCT image datasets. Extensive results indicate that the proposed method improves DME classification by incorporating unlabeled data and significantly outperforms competing MIL methods, which confirms the feasibility of deep semi-supervised multiple instance learning at a low annotation cost.
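The abstract does not say how the teacher is obtained or which consistency constraint is used. One common arrangement, shown here purely as an assumption, is a teacher whose weights are an exponential moving average (EMA) of the student's, with a mean squared error penalty between the two models' predicted probabilities. The names `ema_update` and `consistency_penalty` and the decay value are hypothetical.

```python
def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move each teacher weight toward the matching student weight.
    EMA updating and the decay value are assumptions, not from the paper."""
    return [
        decay * t + (1.0 - decay) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


def consistency_penalty(student_probs, teacher_probs):
    """Mean squared difference between student and teacher class probabilities."""
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    teacher = [1.0, 0.0]
    student = [0.0, 1.0]
    # With decay 0.5 the teacher moves halfway toward the student.
    print(ema_update(teacher, student, decay=0.5))
    # Identical predictions incur no penalty.
    print(consistency_penalty([0.5, 0.5], [0.5, 0.5]))
```

In such a setup the penalty is added to the student's training loss, while the teacher is updated only through `ema_update` and receives no gradients.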