Spatio-temporal layers based intra-operative stereo depth estimation network via hierarchical prediction and progressive training.

Affiliations

Politecnico di Milano, Department of Electronics, Information and Bioengineering, Milano, 20133, Italy.

Publication information

Comput Methods Programs Biomed. 2024 Feb;244:107937. doi: 10.1016/j.cmpb.2023.107937. Epub 2023 Nov 22.

Abstract

BACKGROUND AND OBJECTIVE

The safety of robotic surgery can be enhanced through augmented vision or artificial constraints on the robot's motion. Intra-operative depth estimation is the cornerstone of these applications because it provides precise position information about the surgical scene in 3D space. High-quality depth estimation of endoscopic scenes has long been an important problem, and the development of deep learning offers new possibilities for addressing it.

METHODS

In this paper, a deep learning-based approach is proposed to recover 3D information from intra-operative scenes. To this end, a fully 3D encoder-decoder network integrating spatio-temporal layers is designed. It adopts hierarchical prediction and progressive learning to improve prediction accuracy and shorten training time.

RESULTS

Our network achieves a depth estimation accuracy of MAE 2.55 ± 1.51 mm and RMSE 5.23 ± 1.40 mm on 8 surgical videos with a resolution of 1280×1024, outperforming six other state-of-the-art methods trained on the same data.
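For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean absolute error (MAE) and root mean squared error (RMSE) are commonly computed between a predicted and a ground-truth depth map. It is not the authors' evaluation code. The function name `depth_errors`, the validity rule (finite, positive ground-truth depth), and the toy arrays are assumptions made for illustration; the abstract does not state how invalid pixels are handled or how the ± spread is aggregated.

```python
import numpy as np


def depth_errors(pred_mm, gt_mm, valid_mask=None):
    """Return (MAE, RMSE) in mm over valid pixels.

    Hypothetical helper: by assumption, pixels whose ground-truth
    depth is non-finite or <= 0 are treated as invalid and skipped.
    """
    pred = np.asarray(pred_mm, dtype=np.float64)
    gt = np.asarray(gt_mm, dtype=np.float64)
    if valid_mask is None:
        valid_mask = np.isfinite(gt) & (gt > 0)
    diff = pred[valid_mask] - gt[valid_mask]
    mae = float(np.mean(np.abs(diff)))         # mean absolute error
    rmse = float(np.sqrt(np.mean(diff ** 2)))  # root mean squared error
    return mae, rmse


# Toy 2x2 depth maps in mm; the ground-truth 0.0 marks an invalid pixel.
gt = np.array([[50.0, 60.0], [0.0, 80.0]])
pred = np.array([[52.0, 57.0], [10.0, 84.0]])
mae, rmse = depth_errors(pred, gt)
print(f"MAE = {mae:.2f} mm, RMSE = {rmse:.2f} mm")
```

Because RMSE squares each error before averaging, it weights large deviations more heavily than MAE does, so RMSE is never smaller than MAE on the same data.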

CONCLUSIONS

Our network achieves promising depth estimation performance in intra-operative scenes using stereo images, which would allow its integration into robot-assisted surgery to enhance safety.
