Department of Translational Surgical Oncology, National Centre for Tumor Diseases (NCT/UCC), Dresden, 01307, Germany.
SECAI, TU Dresden, Dresden, Germany.
Int J Comput Assist Radiol Surg. 2024 Jun;19(6):985-993. doi: 10.1007/s11548-024-03079-1. Epub 2024 Feb 26.
In surgical computer vision applications, data privacy concerns and the cost of expert annotation impede the acquisition of labeled training data. Unpaired image-to-image translation techniques have been explored to automatically generate annotated datasets by translating synthetic images into a realistic domain. Preserving structure and semantic consistency, i.e., the per-class distribution, during translation poses a significant challenge, particularly when the semantic distributions of the two domains are mismatched.
This study empirically investigates various translation methods for generating data in surgical applications, with an explicit focus on semantic consistency. Based on this analysis, we introduce a novel and simple combination of effective approaches, which we call ConStructS. The losses defined in this approach operate on multiple image patches and spatial resolutions during translation (an illustrative sketch follows the abstract).
Various state-of-the-art models were extensively evaluated on two challenging surgical datasets. Two evaluation schemes were used to assess both the semantic consistency of the translated images and their usefulness for downstream semantic segmentation. The results demonstrate the effectiveness of ConStructS in minimizing semantic distortion, and the images generated by this model show superior utility for downstream training.
In this study, we tackle semantic inconsistency in unpaired image translation for surgical applications with minimal labeled data. The resulting simple model, ConStructS, enhances consistency during translation and offers a practical way to generate fully labeled, semantically consistent datasets at minimal cost. Our code is available at https://gitlab.com/nct_tso_public/constructs .
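To make the idea of patch- and resolution-level consistency losses concrete, the following is a minimal PyTorch sketch of one way such a loss could be structured: a PatchNCE-style contrastive term computed over randomly sampled feature patches at several image scales. All names, the encoder choice, and the hyperparameters are illustrative assumptions; this is not the authors' ConStructS implementation (see the linked repository for that).

    # Minimal, illustrative sketch of a patchwise contrastive loss applied at
    # multiple spatial resolutions. All names and hyperparameters are
    # assumptions for exposition; NOT the authors' ConStructS implementation.
    import torch
    import torch.nn.functional as F

    def patch_nce_loss(feat_src, feat_tgt, num_patches=64, tau=0.07):
        """PatchNCE-style loss: corresponding patches of the source and the
        translated image are positives; other sampled patches are negatives.

        feat_src, feat_tgt: (B, C, H, W) feature maps at the same resolution.
        """
        b, c, h, w = feat_src.shape
        p = min(num_patches, h * w)
        # Sample the same spatial positions in both maps so that the diagonal
        # of the similarity matrix holds the positive pairs.
        idx = torch.randperm(h * w, device=feat_src.device)[:p]
        src = F.normalize(feat_src.flatten(2).permute(0, 2, 1)[:, idx], dim=-1)  # (B, P, C)
        tgt = F.normalize(feat_tgt.flatten(2).permute(0, 2, 1)[:, idx], dim=-1)
        logits = torch.bmm(tgt, src.transpose(1, 2)) / tau  # (B, P, P)
        labels = torch.arange(p, device=logits.device).unsqueeze(0).expand(b, -1)
        return F.cross_entropy(logits.reshape(-1, p), labels.reshape(-1))

    def multiscale_consistency_loss(encoder, x, y, scales=(1.0, 0.5, 0.25)):
        """Average the patchwise loss over several image resolutions, so that
        consistency is enforced for both fine and coarse structures."""
        total = 0.0
        for s in scales:
            xs = x if s == 1.0 else F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
            ys = y if s == 1.0 else F.interpolate(y, scale_factor=s, mode="bilinear", align_corners=False)
            total = total + patch_nce_loss(encoder(xs), encoder(ys))
        return total / len(scales)

    # Example usage with a toy encoder (any image-to-feature-map module works):
    if __name__ == "__main__":
        encoder = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
        x = torch.randn(2, 3, 64, 64)  # synthetic source image batch
        y = torch.randn(2, 3, 64, 64)  # its translated counterpart
        print(multiscale_consistency_loss(encoder, x, y))

In CUT-style setups the feature encoder is often the first layers of the generator itself; for this sketch, any module mapping images to (B, C, H, W) features would serve.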