Zhao Xinkai, Hayashi Yuichiro, Oda Masahiro, Kitasaka Takayuki, Mori Kensaku
Graduate School of Informatics, Nagoya University, Furo-cho, Chikusaku, Nagoya, Aichi, Japan.
Information Technology Center, Nagoya University, Furo-cho, Chikusaku, Nagoya, Aichi, Japan.
Int J Comput Assist Radiol Surg. 2025 Jul;20(7):1551-1560. doi: 10.1007/s11548-025-03405-1. Epub 2025 Jun 14.
Understanding anatomical structures in laparoscopic images is crucial for various types of laparoscopic surgery, but creating a specialized dataset for each procedure type is both inefficient and challenging. This highlights the clinical significance of class-incremental semantic segmentation (CISS) for laparoscopic images. Although CISS has been widely studied on diverse image datasets, in clinical settings the incremental data typically consist of images from new patients rather than reused previous images, which necessitates a new algorithm.
We introduce a distillation approach driven by a diffusion model for CISS of laparoscopic images. Specifically, an unconditional diffusion model is trained to generate synthetic laparoscopic images, which are then incorporated into subsequent training steps. A distillation network is employed to extract and transfer knowledge from networks trained in earlier steps. Additionally, to address the challenge posed by the limited semantic information available in individual laparoscopic images, we employ cross-image contrastive learning, enhancing the model's ability to distinguish subtle variations across images.
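The abstract describes the pipeline only at a high level, so the following sketch illustrates how one incremental training step of this kind of diffusion-driven distillation could be organized in PyTorch. It is a minimal sketch under stated assumptions, not the authors' implementation: the diffusion_sampler callable, the loss weights, the temperature, and the restriction of distillation to old-class logits are illustrative choices, and the cross-image contrastive term is omitted for brevity.

    # Sketch of one class-incremental step: supervise the student on newly
    # annotated classes while distilling a frozen teacher (the previous-step
    # model) on diffusion-generated replay images. All hyperparameters and the
    # diffusion_sampler interface are illustrative assumptions.
    import torch
    import torch.nn.functional as F

    def incremental_step(student, teacher, diffusion_sampler, new_loader,
                         optimizer, old_num_classes,
                         distill_weight=1.0, temperature=2.0, device="cuda"):
        teacher.eval()
        student.train()
        for images, labels in new_loader:  # images of new patients, new classes
            images, labels = images.to(device), labels.to(device)

            # Supervised segmentation loss on the newly introduced classes.
            logits_new = student(images)  # (B, C_total, H, W)
            loss_seg = F.cross_entropy(logits_new, labels, ignore_index=255)

            # Replay: sample synthetic laparoscopic images from the
            # unconditional diffusion model instead of storing patient data.
            with torch.no_grad():
                synth = diffusion_sampler(batch_size=images.size(0)).to(device)
                teacher_logits = teacher(synth)[:, :old_num_classes]

            # Distill the teacher's soft predictions for the old classes.
            student_logits = student(synth)[:, :old_num_classes]
            loss_kd = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=1),
                F.softmax(teacher_logits / temperature, dim=1),
                reduction="batchmean",
            ) * temperature ** 2

            loss = loss_seg + distill_weight * loss_kd
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        return student

Restricting the distillation term to the old-class logits is one common way to keep the replay signal from interfering with the new classes; the paper's actual loss formulation and the design of its distillation network may differ.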
Our method was trained and evaluated on all 11 anatomical structures from the Dresden Surgical Anatomy Dataset, which presents significant challenges due to its dispersed annotations. Extensive experiments demonstrate that our approach outperforms other methods, especially in difficult categories such as the ureter and vesicular glands, where it surpasses even supervised offline learning.
This study is the first to address class-incremental semantic segmentation for laparoscopic images, significantly improving the adaptability of segmentation models to new anatomical classes in surgical procedures.