Image synthesis with class-aware semantic diffusion models for surgical scene segmentation.

Authors

Zhou Yihang, Towning Rebecca, Awad Zaid, Giannarou Stamatia

Affiliations

Hamlyn Centre for Robotic Surgery, Department of Surgery and Cancer, Imperial College London, London, UK.

Imperial College Healthcare NHS Trust, London, UK.

Publication information

Healthc Technol Lett. 2025 Jan 31;12(1):e70003. doi: 10.1049/htl2.70003. eCollection 2025 Jan-Dec.

DOI:10.1049/htl2.70003

PMID:39897096

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11783686/

Abstract

Surgical scene segmentation is essential for enhancing surgical precision, yet it is frequently compromised by the scarcity and imbalance of available data. To address these challenges, semantic image synthesis methods based on generative adversarial networks and diffusion models have been developed. However, these models often yield non-diverse images and fail to capture small, critical tissue classes, limiting their effectiveness. In response, a class-aware semantic diffusion model (CASDM) is proposed, a novel approach that uses segmentation maps as conditions for image synthesis to tackle data scarcity and imbalance. Novel class-aware mean squared error and class-aware self-perceptual loss functions are defined to prioritize critical, less visible classes, thereby enhancing image quality and relevance. Furthermore, to the authors' knowledge, this is the first work to generate multi-class segmentation maps using text prompts in a novel fashion to specify their contents. These maps are then used by CASDM to generate surgical scene images, enhancing datasets for training and validating segmentation models. The evaluation, which assesses both image quality and downstream segmentation performance, demonstrates the strong effectiveness and generalisability of CASDM in producing realistic image-map pairs, significantly advancing surgical scene segmentation across diverse and challenging datasets.
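The abstract does not give the formula for the class-aware mean squared error, so the following is only a minimal sketch of the general idea of weighting a reconstruction loss so that small classes are not drowned out by large ones. The function name `class_aware_mse` and the inverse-pixel-count weighting are assumptions chosen for illustration, not the definition used in the paper.

```python
import numpy as np

def class_aware_mse(pred, target, seg_map, num_classes):
    """Inverse-frequency-weighted MSE (illustrative assumption, not the paper's loss).

    pred, target: float arrays of shape (H, W, C)
    seg_map:      int array of shape (H, W) with labels in [0, num_classes)
    """
    # Number of pixels belonging to each class in this segmentation map.
    counts = np.bincount(seg_map.ravel(), minlength=num_classes).astype(np.float64)
    present = counts > 0

    # Weight each present class by 1 / (its pixel count), so every present
    # class contributes equally to the loss regardless of its area.
    class_w = np.zeros(num_classes)
    class_w[present] = 1.0 / counts[present]

    # Per-pixel weights; dividing by the number of present classes makes
    # the weights sum to 1 over the image.
    pixel_w = class_w[seg_map] / present.sum()

    # Squared error averaged over channels, then weighted over pixels.
    sq_err = ((pred - target) ** 2).mean(axis=-1)
    return float((pixel_w * sq_err).sum())

# Toy example: class 1 covers a single pixel, and the only error is there.
seg = np.zeros((4, 4), dtype=int)
seg[0, 0] = 1
target = np.zeros((4, 4, 3))
pred = target.copy()
pred[0, 0] = 1.0

plain = float(((pred - target) ** 2).mean())
weighted = class_aware_mse(pred, target, seg, num_classes=2)
```

In this toy case the plain MSE averages the single-pixel error over all 16 pixels, whereas the weighted version gives the one-pixel class the same total weight as the 15-pixel background, so the error on the small class dominates the loss.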

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae66/11783686/18743ff184af/HTL2-12-e70003-g005.jpg

Similar articles

Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16.

Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.

Generative multi-adversarial network for striking the right balance in abdominal image segmentation.

Int J Comput Assist Radiol Surg. 2020 Nov;15(11):1847-1858. doi: 10.1007/s11548-020-02254-4. Epub 2020 Sep 8.

Advancements in urban scene segmentation using deep learning and generative adversarial networks for accurate satellite image analysis.

PLoS One. 2024 Jul 18;19(7):e0307187. doi: 10.1371/journal.pone.0307187. eCollection 2024.

Semi-Supervised Semantic Image Segmentation by Deep Diffusion Models and Generative Adversarial Networks.

Int J Neural Syst. 2024 Nov;34(11):2450057. doi: 10.1142/S0129065724500576. Epub 2024 Aug 15.

Label-informed cardiac magnetic resonance image synthesis through conditional generative adversarial networks.

Comput Med Imaging Graph. 2022 Oct;101:102123. doi: 10.1016/j.compmedimag.2022.102123. Epub 2022 Sep 11.

Guided image generation for improved surgical image segmentation.

Med Image Anal. 2024 Oct;97:103263. doi: 10.1016/j.media.2024.103263. Epub 2024 Jul 3.

Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks.

Sensors (Basel). 2022 Oct 29;22(21):8306. doi: 10.3390/s22218306.

Semantic hyperspectral image synthesis for cross-modality knowledge transfer in surgical data science.

Int J Comput Assist Radiol Surg. 2025 Apr 24. doi: 10.1007/s11548-025-03364-7.

Class-Aware Adversarial Transformers for Medical Image Segmentation.

Adv Neural Inf Process Syst. 2022 Dec;35:29582-29596.
