College of Intelligence and Computing, Tianjin University, Tianjin, China; Tianjin Key Laboratory of Advanced Networking, Tianjin University, Tianjin, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, China.
Tianjin Medical University General Hospital, Tianjin Medical University, Tianjin, China.
Ultrasound Med Biol. 2023 Sep;49(9):1940-1950. doi: 10.1016/j.ultrasmedbio.2023.04.009. Epub 2023 Jun 11.
The main objective of the work described here was to train a semantic segmentation model for thyroid nodule ultrasound images using only classification data, thereby reducing the burden of obtaining pixel-level labeled data sets. Furthermore, we improved the segmentation performance of the model by mining image information to narrow the gap between weakly supervised semantic segmentation (WSSS) and fully supervised semantic segmentation.
Most WSSS methods use a class activation map (CAM) to generate segmentation results. However, the lack of supervision information makes it difficult for a CAM to highlight the object region completely. Therefore, we propose a novel foreground and background pair (FB-Pair) representation, which consists of the high- and low-response regions of the original image, as highlighted by a CAM generated online during training. During training, the original CAM is revised using the CAM generated from the FB-Pair. In addition, we design a self-supervised learning pretext task based on the FB-Pair, which requires the model to predict whether the pixels in the FB-Pair come from the original image. Through this task, the model learns to distinguish accurately between objects of different categories.
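The CAM and FB-Pair construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed high/low response thresholds (0.7 and 0.3), and the zero-fill of masked-out pixels are all assumptions made for clarity.

```python
import numpy as np

def class_activation_map(features, weights, cls):
    """Compute a CAM as the class-weighted sum of feature maps.

    features: (C, H, W) feature maps from the classifier backbone.
    weights:  (num_classes, C) weights of the final classification layer.
    cls:      index of the target class.
    Returns a CAM normalized to [0, 1].
    """
    cam = np.tensordot(weights[cls], features, axes=1)  # (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

def fb_pair(image, cam, hi=0.7, lo=0.3):
    """Build a foreground/background pair from CAM responses.

    The foreground image keeps only high-response pixels (likely nodule);
    the background image keeps only low-response pixels. Masked-out
    pixels are zero-filled here; thresholds are illustrative.
    """
    foreground = np.where(cam >= hi, image, 0.0)
    background = np.where(cam <= lo, image, 0.0)
    return foreground, background
```

In the training loop, the FB-Pair images would be fed back through the network: the CAM obtained from the foreground image is used to revise the original CAM, while the pretext head is trained to predict whether each FB-Pair pixel originated from the original image.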
Experiments on the thyroid nodule ultrasound image (TUI) data set revealed that our proposed method outperformed existing methods, with a 5.7% improvement in mean intersection-over-union (mIoU) segmentation performance over the second-best method and a reduction of the performance gap between benign and malignant nodules to 2.9%.
Our method trains a well-performing segmentation model on ultrasound images of thyroid nodules using only classification data. In addition, our results indicate that a CAM can take full advantage of the information in the images to highlight the target regions more accurately, thereby improving segmentation performance.