Sharifian Rasoul, Rabbani Navid, Bartoli Adrien
DIA2M, DRCI, CHU, Clermont-Ferrand, France.
Institut Pascal, UMR6602 CNRS/UCA, Clermont-Ferrand, France.
Int J Comput Assist Radiol Surg. 2025 Jun;20(6):1215-1229. doi: 10.1007/s11548-025-03375-4. Epub 2025 Apr 26.
Monocular Single-shot Depth Estimation (MoSDE) methods for Minimally-Invasive Surgery (MIS) are promising, but their robustness under surgical conditions remains questionable. We introduce the RoDEM benchmark, which comprises an advanced analysis of perturbations, a dataset acquired in realistic MIS conditions, and metrics. The dataset consists of 29,803 ex-vivo images, including 44 video sequences, with depth ground truth covering clean conditions and nine perturbations. We evaluate the performance of nine existing MoSDE methods.
An RGB-D structured-light camera was firmly attached to a laparoscope. The two cameras were internally calibrated and the rigid transformation between them was estimated. Synchronised images and videos were captured while producing real perturbations in three settings. Finally, the depth maps were transferred to the laparoscope viewpoint and the images were categorised by perturbation severity.
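The transfer of a depth map from the RGB-D camera to the laparoscope viewpoint can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `transfer_depth`, the pinhole model without lens distortion, and the nearest-pixel splatting with a z-buffer are all assumptions made for illustration. `K_src` and `K_dst` stand for the two intrinsic matrices, and `R`, `t` for the estimated rigid transformation from the RGB-D camera frame to the laparoscope frame.

```python
import numpy as np

def transfer_depth(depth_src, K_src, K_dst, R, t, dst_shape):
    """Reproject a depth map from a source camera into a destination camera.

    Hypothetical helper: assumes pinhole cameras without distortion and
    depth values of 0 marking missing measurements.
    """
    h, w = depth_src.shape
    v, u = np.mgrid[0:h, 0:w]
    valid = depth_src > 0
    z = depth_src[valid].astype(np.float64)

    # Back-project valid pixels to 3D points in the source camera frame.
    pix = np.stack([u[valid], v[valid], np.ones_like(z)], axis=0)
    pts_src = (np.linalg.inv(K_src) @ pix) * z

    # Apply the rigid transformation into the destination camera frame.
    pts_dst = R @ pts_src + np.asarray(t, dtype=np.float64).reshape(3, 1)
    pts_dst = pts_dst[:, pts_dst[2] > 0]  # keep points in front of the camera

    # Project into the destination image and round to the nearest pixel.
    proj = K_dst @ pts_dst
    u_d = np.round(proj[0] / proj[2]).astype(int)
    v_d = np.round(proj[1] / proj[2]).astype(int)
    inside = (u_d >= 0) & (u_d < dst_shape[1]) & (v_d >= 0) & (v_d < dst_shape[0])

    # Z-buffer: when several points land on one pixel, keep the closest.
    out = np.full(dst_shape, np.inf)
    np.minimum.at(out, (v_d[inside], u_d[inside]), pts_dst[2, inside])
    out[np.isinf(out)] = 0.0  # pixels that received no point stay empty
    return out
```

Nearest-pixel splatting leaves holes where the destination image is denser than the source; a practical pipeline would likely need interpolation or hole filling, which this sketch omits.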
The proposed metrics cover accuracy (clean condition performance) and robustness (resilience to perturbations). We found that foundation models demonstrated higher accuracy than the other methods. All methods were robust to motion blur and bright light. Methods trained on large datasets were robust against smoke, blood, and low light whereas the other methods exhibited reduced robustness. None of the methods coped with lens dirtiness and defocus blur.
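The distinction between accuracy and robustness can be made concrete with a small sketch. The abstract does not define the RoDEM metrics, so the following is only an assumed illustration: accuracy is measured by the widely used absolute relative error on clean images, and robustness by the relative increase of that error under a perturbation. The function names `abs_rel_error` and `relative_degradation` are hypothetical.

```python
import numpy as np

def abs_rel_error(pred, gt):
    """Mean absolute relative depth error over pixels with valid ground truth.

    Assumption: ground-truth values of 0 mark missing depth.
    """
    valid = gt > 0
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

def relative_degradation(err_clean, err_perturbed):
    """Relative error increase under perturbation (0 means no degradation).

    Illustrative robustness score, not the metric defined by the benchmark.
    """
    return (err_perturbed - err_clean) / err_clean
```

Under this assumed definition, a method whose error rises from 0.10 on clean images to 0.15 under smoke would have a degradation of 0.5, that is, a 50% increase in error.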
This study highlighted the importance of robustness evaluation in MoSDE, as many existing methods showed reduced accuracy under common surgical perturbations. It emphasises the importance of training with large datasets that include perturbations. The proposed benchmark gives a precise and detailed analysis of a method's performance under MIS conditions. It will be made publicly available.