Ge Rongjun, Wang Chong, Liu Yuxin, Lu Chunqiang, Xia Cong, Jiang Yehui, Xu Fangyi, Zhu Yinsu, Zhang Daoqiang, Liu Chengyu, Chen Yang, Li Shuo, He Yuting
School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China.
College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.
Neural Netw. 2025 Jul 13;192:107865. doi: 10.1016/j.neunet.2025.107865.
In the field of medical image segmentation, the scarcity of labeled data makes it difficult for existing models to accurately perceive target regions. Compared with manual annotation, gaze data is easier and cheaper to obtain. As a classical semi-supervised learning framework, mean-teacher can effectively exploit a large number of unlabeled medical images for stable training through self-teaching and collaborative optimization. Building on the mean-teacher framework and incorporating gaze data, our study addresses two crucial issues in semi-supervised medical image segmentation: 1) expanding the scale and diversity of the dataset when labeled data are limited; and 2) enhancing the network's perception ability. We propose the Human Gaze-based Dual Teacher Guidance Learning model (HG-DTGL), in which human gaze serves as an additional hidden 'teacher' within the mean-teacher architecture. We introduce GazeMix to generate reliable mixed data that expand the diversity and scale of the dataset, and a Multi-scale Gaze Perception (MGP) module to capture the network's perception at multiple scales. A Gaze Loss is designed to align the model's perception with human gaze. Extensive experiments on multiple datasets of different modalities show that HG-DTGL achieves superior performance across a total of ten different organs/tissues. This demonstrates that our method generalizes well to medical images of different modalities and highlights the great application potential of gaze data in semi-supervised medical image segmentation.
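For readers unfamiliar with the two mechanisms the abstract names, the sketch below illustrates (1) the standard mean-teacher exponential-moving-average (EMA) update that keeps the teacher a slow copy of the student, and (2) one plausible form of a gaze-alignment loss between a predicted spatial map and a human gaze heatmap. The abstract does not specify HG-DTGL's actual formulation (including GazeMix and MGP), so every function name and the KL-based loss form here are assumptions for illustration only.

```python
# Minimal sketch, not the authors' implementation: mean-teacher EMA update
# plus a hypothetical gaze-alignment loss. All names and the loss form are
# assumptions; the paper's Gaze Loss may differ.
import torch
import torch.nn.functional as F


@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               alpha: float = 0.99) -> None:
    """Mean-teacher EMA update: teacher <- alpha * teacher + (1 - alpha) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)


def gaze_alignment_loss(pred_map: torch.Tensor,
                        gaze_heatmap: torch.Tensor) -> torch.Tensor:
    """Hypothetical gaze loss: KL divergence between the model's spatial map
    (B, 1, H, W) and a human gaze heatmap of the same shape, both treated as
    spatial probability distributions."""
    p = F.log_softmax(pred_map.flatten(1), dim=1)   # model map as log-probs
    q = F.softmax(gaze_heatmap.flatten(1), dim=1)   # gaze heatmap as probs
    return F.kl_div(p, q, reduction="batchmean")
```

In a typical mean-teacher setup, `ema_update` would be called after each optimizer step on the student, while a loss such as `gaze_alignment_loss` could be added to the supervised and consistency terms to pull the model's perception toward the human gaze, in the spirit of the Gaze Loss described above.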