WS-SfMLearner: self-supervised monocular depth and ego-motion estimation on surgical videos with unknown camera parameters.

Author information

Lou Ange, Noble Jack

Affiliation

Vanderbilt University, Department of Electrical and Computer Engineering, Nashville, Tennessee, United States.

Publication information

J Med Imaging (Bellingham). 2025 Mar;12(2):025003. doi: 10.1117/1.JMI.12.2.025003. Epub 2025 Apr 30.

Abstract

PURPOSE

Accurate depth estimation in surgical videos is a pivotal component of numerous image-guided surgery procedures. However, creating ground truth depth maps for surgical videos is often infeasible due to challenges such as inconsistent illumination and sensor noise. As a result, self-supervised depth and ego-motion estimation frameworks are gaining traction, eliminating the need for manually annotated depth maps. Despite the progress, current self-supervised methods still rely on known camera intrinsic parameters, which are frequently unavailable or unrecorded in surgical environments. We address this gap by introducing a self-supervised system capable of jointly predicting depth maps, camera poses, and intrinsic parameters, providing a comprehensive solution for depth estimation under such constraints.

APPROACH

We developed a self-supervised depth and ego-motion estimation framework, incorporating a cost volume-based auxiliary supervision module. This module provides additional supervision for predicting camera intrinsic parameters, allowing for robust estimation even without predefined intrinsics. The system was rigorously evaluated on a public dataset to assess its effectiveness in simultaneously predicting depth, camera pose, and intrinsic parameters.
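To illustrate why camera intrinsics matter in this kind of framework, here is a minimal sketch of the reprojection step that self-supervised depth and ego-motion methods commonly use for view synthesis. It is not the authors' implementation, and the abstract does not describe their network or cost volume module in detail. The function name `reproject_pixel` and all parameter names are hypothetical, and a simple pinhole camera model without lens distortion is assumed.

```python
from typing import Sequence, Tuple


def reproject_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    R: Sequence[Sequence[float]], t: Sequence[float],
) -> Tuple[float, float]:
    """Map a target-frame pixel to source-frame pixel coordinates.

    Assumes a pinhole camera shared by both frames (hypothetical setup).
    R (3x3) and t (3,) take points from the target camera frame to the
    source camera frame.
    """
    # Back-project the pixel to a 3D point using the intrinsics and depth.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth

    # Apply the relative camera pose (ego-motion) as a rigid transform.
    xs = R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0]
    ys = R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1]
    zs = R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2]

    # Project back to the image plane with the same intrinsics.
    return fx * xs / zs + cx, fy * ys / zs + cy


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# With no camera motion, a pixel maps to itself.
same = reproject_pixel(5.0, 2.0, 2.0, 100.0, 100.0, 4.0, 3.0,
                       IDENTITY, [0.0, 0.0, 0.0])

# A sideways translation shifts the pixel by fx * tx / depth.
shifted = reproject_pixel(5.0, 2.0, 2.0, 100.0, 100.0, 4.0, 3.0,
                          IDENTITY, [0.1, 0.0, 0.0])
```

In a typical pipeline of this kind, the source image is sampled at the reprojected coordinates and compared with the target image. Because `fx`, `fy`, `cx`, and `cy` appear in both the back-projection and projection steps, wrong intrinsics distort the synthesized view, which is one reason a method that predicts them may benefit from additional supervision.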

RESULTS

The experimental results demonstrated that the proposed method significantly improved the accuracy of ego-motion and depth prediction, even when compared with methods incorporating known camera intrinsics. In addition, by integrating our cost volume-based supervision, the accuracy of camera parameter estimation, including intrinsic parameters, was further enhanced.
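The abstract does not state which metrics were used to measure depth accuracy. As a rough illustration only, the sketch below computes two error measures that are widely used for monocular depth estimation, absolute relative error and root-mean-square error, after median scaling. The median scaling step is an assumption; it is common in self-supervised monocular work because depth is predicted only up to scale. The function name `depth_errors` is hypothetical.

```python
import math
from statistics import median
from typing import Sequence, Tuple


def depth_errors(pred: Sequence[float], gt: Sequence[float]) -> Tuple[float, float]:
    """Return (abs_rel, rmse) over pixels with valid ground truth (gt > 0)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")

    # Median scaling: align the scale of the prediction to the ground truth.
    scale = median(g for _, g in pairs) / median(p for p, _ in pairs)
    scaled = [(p * scale, g) for p, g in pairs]

    # Mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in scaled) / len(scaled)
    # Root-mean-square error in depth units.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in scaled) / len(scaled))
    return abs_rel, rmse


# A prediction that is correct up to a global scale has zero error.
perfect = depth_errors([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])

# One pixel off by 1 unit at a true depth of 4.
off_by_one = depth_errors([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```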

CONCLUSIONS

We present a self-supervised system for depth, ego-motion, and intrinsic parameter estimation, effectively overcoming the limitations imposed by unknown or missing camera intrinsics. The experimental results confirm that the proposed method outperforms the baseline techniques, offering a robust solution for depth estimation in complex surgical video scenarios, with broader implications for improving image-guided surgery systems.
