Song Changshun, Peng Jun, Chen Qi, Qin Zhibao, Tai Yonghang
Yunnan Key Lab Optoelectronic Information Technology, Yunnan Normal University, Kunming 650500, China; Yunnan Province Photoelectric Detection and Perception Technology Engineering Research Center, China.
Department of Thoracic Surgery, Institute of The First People's Hospital of Yunnan Province, Kunming 650032, China.
Comput Methods Programs Biomed. 2025 Aug;268:108856. doi: 10.1016/j.cmpb.2025.108856. Epub 2025 May 12.
Robot-assisted surgery has revolutionized modern medical procedures by enhancing precision, reducing invasiveness, and providing a clearer, more controlled operating environment. However, it still faces challenges in fully visualizing the target tissue, particularly from multiple perspectives. This limitation is most evident in minimally invasive surgery; the ability to synthesize novel views of the surgical scene is therefore becoming increasingly critical. By generating multi-view visualizations, surgeons can gain a more comprehensive understanding of the target tissue, improving spatial awareness and decision-making during surgery.
This article proposes a novel view synthesis method for robotic surgical scenes that uses a pre-trained depth estimation model to obtain global depth information and resolves the scale ambiguity that arises in the transition regions of the Gaussian distributions in the 3D Gaussian model. In addition, we introduce a multi-scale loss optimization strategy that captures features at various scales through a multi-scale loss function, regularizing the Gaussian parameters while maintaining the 3D consistency of the splatting.
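The abstract does not give the exact formulations, but the two ingredients it names can be sketched in standard form. The snippet below is a minimal, illustrative PyTorch sketch, not the authors' implementation: align_depth_scale fits a per-image scale and shift by least squares to reconcile a relative monocular depth map with sparse metric depth, one common way to resolve the scale ambiguity of a pre-trained depth estimator, and multiscale_loss evaluates an L1 photometric term over an image pyramid. The function names, pyramid depth, and per-level weights are all assumptions.

```python
import torch
import torch.nn.functional as F

def align_depth_scale(pred_depth, metric_depth, mask):
    """Fit scale s and shift b minimizing ||s*pred + b - metric|| over the
    masked pixels, then apply them to the whole map. Illustrative handling
    of monocular-depth scale ambiguity; not the paper's exact method."""
    p = pred_depth[mask]                              # [N] predicted (relative) depths
    t = metric_depth[mask]                            # [N] reference (metric) depths
    A = torch.stack([p, torch.ones_like(p)], dim=1)   # [N, 2] design matrix
    sol = torch.linalg.lstsq(A, t.unsqueeze(1)).solution
    s, b = sol[0, 0], sol[1, 0]
    return s * pred_depth + b

def multiscale_loss(render, target, num_scales=4, decay=0.85):
    """L1 photometric loss summed over an average-pooled image pyramid so
    that both coarse structure and fine detail constrain the Gaussian
    parameters. Inputs are (B, C, H, W) tensors; the pyramid depth and
    per-level weights are assumed, not taken from the paper."""
    loss = 0.0
    for i in range(num_scales):
        loss = loss + (decay ** i) * F.l1_loss(render, target)
        if i < num_scales - 1:                        # downsample for next level
            render = F.avg_pool2d(render, kernel_size=2)
            target = F.avg_pool2d(target, kernel_size=2)
    return loss
```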
Our method is evaluated against current novel view synthesis techniques on our robotic surgery scene dataset as well as the public Hamlyn and StereoMIS datasets. The proposed approach achieved an average PSNR of 33.45 dB, SSIM of 0.939, LPIPS of 0.153, and RMSE of 0.022 across the three datasets.
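For reference, the sketch below shows how these standard image quality metrics are typically computed for a single synthesized view; it is an illustrative re-implementation, not the authors' evaluation code. PSNR and RMSE follow directly from the mean squared error, SSIM comes from scikit-image, and LPIPS (omitted here) is usually computed with the lpips package.

```python
import numpy as np
from skimage.metrics import structural_similarity

def evaluate_view(pred, gt):
    """Per-view quality metrics for float images in [0, 1], shape (H, W, 3).
    Illustrative only; the paper's averaging protocol may differ."""
    mse = np.mean((pred - gt) ** 2)
    rmse = float(np.sqrt(mse))
    psnr = float(10.0 * np.log10(1.0 / mse))   # peak signal value is 1.0
    ssim = structural_similarity(pred, gt, channel_axis=-1, data_range=1.0)
    return {"PSNR": psnr, "SSIM": ssim, "RMSE": rmse}

# LPIPS is typically computed with the `lpips` package on (1, 3, H, W)
# tensors scaled to [-1, 1], e.g. lpips.LPIPS(net="alex")(pred_t, gt_t).
```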
Our approach enhances the visualization capabilities of robotic surgical systems by synthesizing novel views of surgical scenes. A deeper understanding of the target tissue improves patient safety during surgery and supports surgeon training. These advances will help make robot-assisted surgery more adaptable to diverse clinical scenarios.