Yu James, Pruitt Kelden, Nawawithan Nati, Johnson Brett A, Gahan Jeffrey, Fei Baowei
Center for Imaging and Surgical Innovation, University of Texas at Dallas, Richardson, TX.
Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX.
Proc SPIE Int Soc Opt Eng. 2024 Feb;12928. doi: 10.1117/12.3008768. Epub 2024 Mar 29.
Augmented reality (AR) has attracted increasing interest for its application in surgical procedures. AR-guided surgical systems can overlay anatomy segmented from pre-operative imaging onto the user's environment to delineate hard-to-see structures and subsurface lesions intraoperatively. While previous works have utilized pre-operative imaging such as computed tomography or magnetic resonance images, registration methods still lack the ability to accurately register deformable anatomical structures without fiducial markers across modalities and dimensionalities. This is especially true of minimally invasive abdominal surgical techniques, which often rely on a monocular laparoscope and its inherent limitations. Surgical scene reconstruction is critical to the accurate registration needed for AR-guided surgery and for other downstream AR applications such as remote assistance and surgical simulation. In this work, we utilize a state-of-the-art (SOTA) deep-learning-based visual simultaneous localization and mapping (vSLAM) algorithm to generate a dense 3D reconstruction, with camera pose estimates and depth maps, from video obtained with a monocular laparoscope. The proposed method can robustly reconstruct surgical scenes from real-time data and provide camera pose estimates without stereo cameras or additional sensors, which increases usability and reduces intrusiveness. We also demonstrate a framework to evaluate current vSLAM algorithms on non-Lambertian, low-texture surfaces and explore the use of their outputs in downstream tasks. We expect these evaluation methods to support the continual refinement of newer algorithms for AR-guided surgery.
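The abstract describes fusing per-frame depth maps and camera pose estimates from a monocular vSLAM system into a dense 3D reconstruction. The sketch below illustrates that fusion step only; it is not the authors' implementation, and the function names, pinhole intrinsics, and depth ranges are assumptions made for demonstration.

```python
# Illustrative sketch (not the paper's code): fusing per-frame depth maps
# and camera-to-world poses from a monocular vSLAM system into one dense
# world-space point cloud. Intrinsics, shapes, and names are assumed.
import numpy as np

def backproject_depth(depth, K):
    """Back-project a depth map (H, W) into camera-space 3D points (N, 3)
    using pinhole intrinsics K (3x3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = z > 0                       # skip pixels with no depth estimate
    x = (u.ravel()[valid] - K[0, 2]) * z[valid] / K[0, 0]
    y = (v.ravel()[valid] - K[1, 2]) * z[valid] / K[1, 1]
    return np.stack([x, y, z[valid]], axis=1)

def fuse_frames(depths, poses, K):
    """Accumulate per-frame points into a world-space cloud.
    `poses` are 4x4 camera-to-world matrices, e.g. from a vSLAM tracker."""
    cloud = []
    for depth, T_wc in zip(depths, poses):
        pts_c = backproject_depth(depth, K)
        pts_w = (T_wc[:3, :3] @ pts_c.T).T + T_wc[:3, 3]  # rigid transform
        cloud.append(pts_w)
    return np.concatenate(cloud, axis=0)

if __name__ == "__main__":
    K = np.array([[525.0, 0.0, 320.0],   # assumed laparoscope intrinsics
                  [0.0, 525.0, 240.0],
                  [0.0, 0.0, 1.0]])
    # Toy depths in meters, roughly laparoscopic working distance.
    depths = [np.random.uniform(0.05, 0.15, (480, 640)) for _ in range(3)]
    poses = [np.eye(4) for _ in range(3)]  # identity poses for the demo
    print(fuse_frames(depths, poses, K).shape)
```

In practice the depth maps and poses would come from the vSLAM front end rather than random data, and the accumulated cloud would typically be filtered or voxelized before registration.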
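The abstract also mentions a framework for evaluating vSLAM algorithms on non-Lambertian, low-texture surfaces but does not specify its metrics. A common metric for assessing the camera pose estimates such a framework would examine is absolute trajectory error (ATE) after rigid alignment; the sketch below is a generic, assumed implementation offered only as background, not the paper's evaluation framework.

```python
# Illustrative sketch (assumed, not from the paper): absolute trajectory
# error (ATE), a standard vSLAM pose-accuracy metric, computed after
# least-squares rigid (Kabsch/Umeyama-style) alignment to a reference.
import numpy as np

def rigid_align(est, ref):
    """Rotation R and translation t minimizing ||R @ est_i + t - ref_i||."""
    mu_e, mu_r = est.mean(axis=0), ref.mean(axis=0)
    H = (est - mu_e).T @ (ref - mu_r)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T                         # proper rotation (det = +1)
    t = mu_r - R @ mu_e
    return R, t

def absolute_trajectory_error(est, ref):
    """RMSE between aligned estimated positions and reference positions."""
    R, t = rigid_align(est, ref)
    aligned = est @ R.T + t
    return np.sqrt(np.mean(np.sum((aligned - ref) ** 2, axis=1)))

if __name__ == "__main__":
    ref = np.cumsum(np.random.randn(100, 3) * 0.01, axis=0)  # toy trajectory
    est = ref + np.random.randn(100, 3) * 0.001              # noisy estimate
    print(absolute_trajectory_error(est, ref))               # small ATE
```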