Zheng Licheng, Wang Lihui, Ou Yingfeng, Wang Li, Jian Caiqing, Zhu Yuemin
Key Laboratory of Advanced Medical Imaging and Intelligent Computing of Guizhou Province, Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, China.
University Lyon, INSA Lyon, CNRS, Inserm, IRP Metislab CREATIS UMR5220, U1294, Lyon, France.
Med Phys. 2025 Apr 11. doi: 10.1002/mp.17809.
Semi-supervised medical image segmentation methods based on the mean teacher (MT) framework offer a promising way to address dense prediction problems when annotated images are scarce and unlabeled images are plentiful. However, two issues keep such models from further improving segmentation performance: the confirmation bias caused by the distribution difference between labeled and unlabeled data, and the parameter coupling problem of MT.
To reduce confirmation bias and alleviate the parameter coupling problem in the MT framework, this work proposes a novel data augmentation strategy and a cross exponential moving average (crossEMA) architecture.
Specifically, a dual swap mixing data augmentation method was first proposed, which exchanges patches between labeled and unlabeled images twice to decrease the confirmation bias caused by distribution divergence. Subsequently, a novel architecture was designed for both the student and teacher networks, each with two structurally identical decoders, one of which adopts a dropout operation. Labeled, unlabeled, and mixed images were fed into this MT architecture. For unlabeled data, the pseudo-labels generated by the dual decoders of the teacher network were used to supervise the predictions of the corresponding decoders of the student network. For mixed data, the real labels of the labeled data were mixed with the teacher-predicted pseudo-labels of the unlabeled data to form the supervisory signal, which was used to enforce prediction consistency on mixed data between the student and teacher networks. To overcome the parameter coupling problem between the student and teacher networks, the encoder parameters of the teacher network were updated using an exponential moving average (EMA) strategy, while its dual decoder parameters were updated using a cross EMA strategy: the perturbed (dropout) decoder of the teacher network was updated with the non-perturbed decoder parameters of the student network, and vice versa.
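The cross EMA update described above can be illustrated with a minimal sketch. It assumes one particular reading of the rule, namely that each teacher decoder is averaged toward the opposite student decoder while the teacher encoder follows an ordinary EMA of the student encoder. The parameter layout (plain dictionaries of float lists), the key names `encoder`, `decoder_plain`, and `decoder_dropout`, and the decay value `alpha` are all hypothetical and are not taken from the paper.

```python
def ema_update(teacher_params, student_params, alpha=0.99):
    """In-place EMA: teacher <- alpha * teacher + (1 - alpha) * student."""
    for name, student_values in student_params.items():
        teacher_values = teacher_params[name]
        for i, s in enumerate(student_values):
            teacher_values[i] = alpha * teacher_values[i] + (1.0 - alpha) * s


def cross_ema_step(teacher, student, alpha=0.99):
    """One teacher update with a crossed decoder pairing (assumed reading).

    `teacher` and `student` are dicts with the hypothetical keys
    "encoder", "decoder_plain" and "decoder_dropout"; each maps
    parameter names to lists of floats.
    """
    # Encoder: ordinary EMA of the student encoder.
    ema_update(teacher["encoder"], student["encoder"], alpha)
    # Decoders: each teacher decoder tracks the *other* student decoder,
    # so no teacher decoder is a smoothed copy of its direct counterpart.
    ema_update(teacher["decoder_dropout"], student["decoder_plain"], alpha)
    ema_update(teacher["decoder_plain"], student["decoder_dropout"], alpha)


if __name__ == "__main__":
    teacher = {
        "encoder": {"w": [0.0]},
        "decoder_plain": {"w": [0.0]},
        "decoder_dropout": {"w": [0.0]},
    }
    student = {
        "encoder": {"w": [1.0]},
        "decoder_plain": {"w": [2.0]},
        "decoder_dropout": {"w": [3.0]},
    }
    cross_ema_step(teacher, student, alpha=0.5)
    print(teacher)
```

In a real training loop the same pairing would be applied to network weight tensors after each optimizer step on the student; the student parameters themselves are never modified by this update.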
Comparisons with several state-of-the-art (SOTA) semi-supervised segmentation methods on four publicly available datasets showed that the proposed method outperforms existing models. The Dice similarity coefficient (DSC) and volume similarity (VS) improved by at least 2.33% and 1.86%, respectively, over the corresponding second-best methods. Multiple ablation experiments verified that the proposed dual swap strategy reduces the distributional differences between unlabeled data and the combined labeled and mixed data. In addition, the cross EMA strategy avoids early convergence of the student and teacher networks.
The proposed strategies alleviate both the confirmation bias caused by the distribution discrepancy between labeled and unlabeled data in semi-supervised learning and the parameter coupling between the student and teacher networks in the MT architecture, and therefore provide a promising approach to semi-supervised medical image segmentation.
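For readers unfamiliar with the two reported metrics, the sketch below computes them for binary masks. The abstract does not state the formulas, so this assumes the commonly used definitions DSC = 2|A∩B| / (|A| + |B|) and VS = 1 - ||A| - |B|| / (|A| + |B|). Returning 1.0 when both masks are empty is a convention chosen here, and the function names are illustrative.

```python
def dice_coefficient(pred, target):
    """DSC = 2|A ∩ B| / (|A| + |B|) for flat sequences of 0/1 labels."""
    pred_size = sum(pred)
    target_size = sum(target)
    if pred_size + target_size == 0:
        return 1.0  # convention assumed here: two empty masks agree fully
    overlap = sum(p * t for p, t in zip(pred, target))
    return 2.0 * overlap / (pred_size + target_size)


def volume_similarity(pred, target):
    """VS = 1 - ||A| - |B|| / (|A| + |B|); ignores where the voxels lie."""
    pred_size = sum(pred)
    target_size = sum(target)
    if pred_size + target_size == 0:
        return 1.0  # same empty-mask convention as above
    return 1.0 - abs(pred_size - target_size) / (pred_size + target_size)


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(prediction, reference))
    print(volume_similarity(prediction, reference))
```

The example masks have equal size but only partial overlap, which shows why the two metrics are reported together: VS compares segmented volumes only, whereas DSC also penalizes misplaced voxels.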