IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3779-3790. doi: 10.1109/TPAMI.2021.3058410. Epub 2022 Jun 3.
Among the greatest challenges of minimally invasive surgery (MIS) is the inadequate visualisation of the surgical field through keyhole incisions. Moreover, occlusions caused by instruments or bleeding can completely obscure anatomical landmarks, reduce surgical vision and lead to iatrogenic injury. The aim of this paper is to propose an unsupervised end-to-end deep learning framework, based on fully convolutional neural networks, to reconstruct the view of the surgical scene under occlusions and provide the surgeon with intraoperative see-through vision in these areas. A novel generative densely connected encoder-decoder architecture has been designed which incorporates temporal information by introducing a new type of 3D convolution, the so-called 3D partial convolution, to enhance the learning capabilities of the network and fuse temporal and spatial information. To train the proposed framework, a unique loss function has been proposed which combines feature matching, reconstruction, style, temporal and adversarial loss terms to generate high-fidelity image reconstructions. Advancing the state of the art, our method can reconstruct the underlying view obstructed by irregularly shaped occlusions of divergent size, location and orientation. The proposed method has been validated on in vivo MIS video data, as well as on natural scenes, over a range of occlusion-to-image ratios (OIRs). It has also been compared against the latest video inpainting models in terms of image reconstruction quality using different assessment metrics. The performance evaluation analysis verifies the superiority of our proposed method and its potential clinical value.
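To make the 3D partial convolution concrete, the sketch below follows the standard partial-convolution formulation (convolve only over valid voxels, renormalise by the fraction of valid inputs under each kernel window, and propagate an updated validity mask), extended to 3D so the kernel spans time as well as space. This is a minimal PyTorch illustration under those assumptions; the `PartialConv3d` class name, its arguments and the mask convention are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv3d(nn.Module):
    """Minimal sketch of a 3D partial convolution: the kernel spans
    (time, height, width), sees only valid (unoccluded) voxels, and
    emits an updated validity mask for the next layer."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size, stride, padding, bias=True)
        # Fixed all-ones kernel used only to count valid voxels per window.
        self.register_buffer(
            "mask_kernel", torch.ones(1, 1, kernel_size, kernel_size, kernel_size)
        )
        self.window_size = kernel_size ** 3
        self.stride, self.padding = stride, padding

    def forward(self, x, mask):
        # x: (N, C, T, H, W) clip; mask: (N, 1, T, H, W) float,
        # 1 for visible voxels, 0 for occluded ones.
        with torch.no_grad():
            valid_count = F.conv3d(
                mask, self.mask_kernel, stride=self.stride, padding=self.padding
            )
        out = self.conv(x * mask)
        # Renormalise by the fraction of valid inputs in each window;
        # windows with no valid voxels are zeroed out.
        scale = self.window_size / valid_count.clamp(min=1.0)
        bias = self.conv.bias.view(1, -1, 1, 1, 1)
        out = (out - bias) * scale * (valid_count > 0).float() + bias
        new_mask = (valid_count > 0).float()
        return out, new_mask

# Usage on a short clip with a simulated occlusion:
layer = PartialConv3d(3, 16)
frames = torch.randn(1, 3, 8, 64, 64)
mask = torch.ones(1, 1, 8, 64, 64)
mask[..., 20:40, 20:40] = 0          # occluded region in every frame
features, updated_mask = layer(frames, mask)
```

Propagating the mask layer by layer is what lets the valid region grow through the encoder, so later layers can fill in voxels that no visible input overlapped at the first layer.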
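The abstract names five loss terms without giving their forms. The sketch below shows one conventional way such terms are combined in inpainting work: L1 reconstruction, VGG-style feature matching, Gram-matrix style loss, a frame-difference temporal term, and a non-saturating adversarial term. The weights, the `total_loss` signature and the exact form of each term are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Gram matrix of a (N, C, H, W) feature map, used for the style term."""
    n, c, h, w = feat.shape
    f = feat.view(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def total_loss(pred, target, pred_prev, target_prev,
               feats_pred, feats_target, d_fake,
               w_rec=1.0, w_feat=0.1, w_style=100.0, w_temp=1.0, w_adv=0.01):
    """Sketch of a combined inpainting loss; weights are placeholders.
    feats_pred / feats_target are assumed to be lists of feature maps
    from a pretrained extractor (e.g. VGG), and d_fake the discriminator
    logits on the generated frame."""
    # Pixel-wise reconstruction loss.
    l_rec = F.l1_loss(pred, target)
    # Feature-matching and style losses over extractor activations.
    l_feat = sum(F.l1_loss(fp, ft) for fp, ft in zip(feats_pred, feats_target))
    l_style = sum(F.l1_loss(gram_matrix(fp), gram_matrix(ft))
                  for fp, ft in zip(feats_pred, feats_target))
    # Temporal consistency: frame-to-frame change should match the target's.
    l_temp = F.l1_loss(pred - pred_prev, target - target_prev)
    # Non-saturating generator adversarial loss on discriminator logits.
    l_adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return (w_rec * l_rec + w_feat * l_feat + w_style * l_style
            + w_temp * l_temp + w_adv * l_adv)
```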