CNRS, ICube, University of Strasbourg, Strasbourg, France.
IHU Strasbourg, Strasbourg, France.
Int J Comput Assist Radiol Surg. 2022 Aug;17(8):1469-1476. doi: 10.1007/s11548-022-02629-9. Epub 2022 Apr 26.
Semantic segmentation and activity classification are key components in building intelligent surgical systems that can understand and assist the clinical workflow. In the operating room (OR), semantic segmentation is central to building robots that are aware of their clinical surroundings, whereas activity classification aims to understand the OR workflow at a higher level. State-of-the-art semantic segmentation and activity recognition approaches are fully supervised and therefore do not scale, since they depend on large amounts of annotated data. Self-supervision can reduce the amount of annotated data needed.
We propose a new 3D self-supervised task for OR scene understanding that uses OR scene images captured with time-of-flight (ToF) cameras. In contrast to other self-supervised approaches, whose handcrafted pretext tasks focus on 2D image features, our task consists of predicting the relative 3D distance between image patches by exploiting the depth maps. By learning 3D spatial context, it produces discriminative features for our downstream tasks.
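To make the pretext task more concrete, the following is a minimal sketch of how a relative 3D distance target for two image patches could be derived from a depth map. It is an illustration under stated assumptions, not the authors' implementation: the pinhole camera model, the intrinsics tuple `(fx, fy, cx, cy)`, the use of the patch's mean depth, and the helper names `backproject`, `patch_center_3d`, and `sample_pair_target` are all hypothetical choices that the abstract does not specify.

```python
import math
import random


def backproject(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z into camera-frame coordinates.

    Assumes a pinhole camera model; the abstract does not state which
    camera model or calibration is actually used.
    """
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def patch_center_3d(depth, top, left, size, intrinsics):
    """Return one 3D point representing a square patch of the depth map.

    Assumption: the patch is summarized by its center pixel combined with
    the mean depth over the patch.
    """
    fx, fy, cx, cy = intrinsics
    values = [depth[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    z = sum(values) / len(values)
    u = left + (size - 1) / 2.0  # horizontal center of the patch (column)
    v = top + (size - 1) / 2.0   # vertical center of the patch (row)
    return backproject(u, v, z, fx, fy, cx, cy)


def sample_pair_target(depth, size, intrinsics, rng):
    """Sample two random patches and return their corners and 3D distance.

    The Euclidean distance between the two patch points is the value a
    network would be trained to predict from the two image patches.
    """
    height, width = len(depth), len(depth[0])
    corners = [(rng.randrange(height - size + 1),
                rng.randrange(width - size + 1))
               for _ in range(2)]
    p, q = (patch_center_3d(depth, top, left, size, intrinsics)
            for top, left in corners)
    return corners, math.dist(p, q)
```

In such a setup, the depth map only supplies the training target, so no manual annotation is involved; for example, `sample_pair_target(depth, 2, (100.0, 100.0, 0.0, 0.0), random.Random(0))` would return the two patch corners together with their distance in the depth map's units.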
Our approach is evaluated on two tasks and datasets containing multiview data captured in clinical scenarios. We demonstrate a noteworthy improvement in performance on both tasks, particularly in the low-data regime, where the utility of self-supervised learning is highest.
We propose a novel privacy-preserving self-supervised approach utilizing depth maps. Our proposed method shows performance on par with other self-supervised approaches and could be an interesting way to alleviate the burden of full supervision.