Computer Science, Technical University Darmstadt, Fraunhoferstraße 5, 64283, Darmstadt, Hessen, Germany.
Int J Comput Assist Radiol Surg. 2023 Jul;18(7):1217-1224. doi: 10.1007/s11548-023-02925-y. Epub 2023 May 23.
Image-to-image translation methods can address the lack of diversity in publicly available cataract surgery data. However, applying image-to-image translation to videos, which are frequently used in medical downstream applications, induces artifacts. Additional spatio-temporal constraints are needed to produce realistic translations and improve the temporal consistency of translated image sequences.
We introduce a motion-translation module that translates optical flows between domains to impose such constraints. We combine it with a shared latent space translation model to improve image quality. We evaluate the translated sequences in terms of image quality and temporal consistency, and propose novel quantitative metrics for the latter. Finally, we evaluate the downstream task of surgical phase classification after retraining the classifier with additional synthetic translated data.
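The abstract does not define the proposed temporal-consistency metrics. As a rough illustration of what a flow-based consistency measure can look like, the sketch below computes a simple warping error: the previous translated frame is warped into the current one with a given optical flow, and the mean absolute difference is taken over pixels with a valid source location. This is a generic, commonly used formulation and not necessarily the authors' metric. The function names `warp_backward` and `warping_error`, the backward-flow convention, and the nearest-neighbour sampling are all assumptions made for this example.

```python
import numpy as np


def warp_backward(frame, flow):
    """Warp `frame` with a backward flow field (illustrative helper).

    frame: array of shape (H, W) or (H, W, C).
    flow:  array of shape (H, W, 2); flow[y, x] = (dx, dy) points from
           pixel (x, y) in the current frame to its source location in
           `frame`. Nearest-neighbour sampling is an assumption made
           to keep the sketch dependency-free.
    Returns the warped frame and a boolean mask of in-bounds pixels.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    # Pixels whose source falls outside the image carry no information.
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    src_x = np.clip(src_x, 0, w - 1)
    src_y = np.clip(src_y, 0, h - 1)
    return frame[src_y, src_x], valid


def warping_error(prev_translated, curr_translated, flow):
    """Mean absolute difference between the current translated frame and
    the flow-warped previous translated frame, over valid pixels only.
    Lower values indicate more temporally consistent translations."""
    warped, valid = warp_backward(prev_translated, flow)
    diff = np.abs(curr_translated.astype(float) - warped.astype(float))
    if diff.ndim == 3:
        diff = diff.mean(axis=-1)  # average over colour channels
    if not valid.any():
        return float("nan")
    return float(diff[valid].mean())
```

In practice the flow would be estimated on the source-domain frames (or produced by a module such as the one described above), and the error would be averaged over all consecutive frame pairs of a translated sequence.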
Our proposed method produces more temporally consistent translations than state-of-the-art baselines while remaining competitive in per-image translation quality. We further show that consistently translated cataract surgery sequences improve performance on the downstream task of surgical phase prediction.
The proposed module increases the temporal consistency of translated sequences. Furthermore, the imposed temporal constraints increase the usability of translated data in downstream tasks. This helps overcome some of the hurdles of surgical data acquisition and annotation and makes it possible to improve models' performance by translating between existing datasets of sequential frames.