Shi Yuxuan, Wang Hong, Ji Haoqin, Liu Haozhe, Li Yuexiang, He Nanjun, Wei Dong, Huang Yawen, Dai Qi, Wu Jianrong, Chen Xinrong, Zheng Yefeng, Yu Hongmeng
ENT Institute and Department of Otolaryngology, Eye & ENT Hospital, Fudan University, Shanghai, 200031, China.
Tencent Jarvis Lab, Shenzhen 518000, China.
Med Image Anal. 2023 Dec;90:102973. doi: 10.1016/j.media.2023.102973. Epub 2023 Sep 20.
In medical image analysis, accurate lesion segmentation benefits subsequent clinical diagnosis and treatment planning. Various deep learning-based methods have been proposed for the segmentation task. Although they achieve promising performance, fully-supervised approaches require pixel-level annotations for model training, which are tedious and time-consuming for experienced radiologists to collect. In this paper, we propose a weakly semi-supervised segmentation framework, called Point Segmentation Transformer (Point SEGTR). The framework trains the network on a small amount of fully-supervised data with pixel-level segmentation masks and a large amount of weakly-supervised data with point-level annotations (i.e., a single annotated point inside each object), which significantly reduces the demand for pixel-level annotations. To fully exploit both annotation types, we propose two regularization terms, multi-point consistency and symmetric consistency, to boost the quality of pseudo labels, which are then used to train a student model for inference. Extensive experiments are conducted on three endoscopy datasets covering different lesion structures and body sites (e.g., colorectum and nasopharynx). The results substantiate the effectiveness and generality of the proposed method, as well as its potential to loosen the requirement for pixel-level annotations, which is valuable for clinical applications.
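The two regularization terms can be illustrated with a minimal sketch. The function names, array shapes, and mean-squared-error formulation below are assumptions made for illustration only, not the authors' implementation: multi-point consistency is read as "predictions seeded from different annotated points inside the same object should agree", and symmetric consistency as "the prediction on a flipped input, flipped back, should match the prediction on the original input".

```python
import numpy as np

def multi_point_consistency_loss(preds_per_point):
    """Penalize disagreement among segmentation maps produced from
    different annotated points of the same object (illustrative MSE form)."""
    stack = np.stack(preds_per_point)          # (num_points, H, W)
    mean_pred = stack.mean(axis=0)             # consensus prediction
    return float(np.mean((stack - mean_pred) ** 2))

def symmetric_consistency_loss(pred, pred_on_flipped):
    """Penalize mismatch between the prediction on the original image and
    the un-flipped prediction on the horizontally flipped image."""
    restored = np.flip(pred_on_flipped, axis=-1)  # undo the horizontal flip
    return float(np.mean((pred - restored) ** 2))
```

In this reading, both losses reach zero only when the model is invariant to the choice of seed point and to horizontal flipping, which is one plausible way such terms could improve pseudo-label quality.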