Ye Wentao, Wu Jianghong, Zhang Wei, Sun Liyang, Dong Xue, Xu Shuogui
China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 200240, China.
Xinqiao Hospital, Chongqing 400037, China.
Bioengineering (Basel). 2025 Jan 26;12(2):114. doi: 10.3390/bioengineering12020114.
In image-guided surgery (IGS) practice, combining intraoperative 2D X-ray images with preoperative 3D images from computed tomography (CT) enables rapid and accurate localization of lesions, allowing for more minimally invasive and efficient surgery and reducing the risk of secondary injury to nerves and vessels. Conventional optimization-based methods for 2D X-ray to 3D CT matching are limited in speed and precision due to non-convex optimization spaces and a constrained search range. Recently, deep learning (DL) approaches have demonstrated remarkable proficiency in solving complex nonlinear 2D-3D registration. In this paper, a fast and robust DL-based registration method is proposed that takes an intraoperative 2D X-ray image as input, compares it with the preoperative 3D CT, and outputs their relative pose as translations (x, y, z) and rotations (pitch, yaw, roll). The method employs a dual-channel Swin transformer feature extractor equipped with attention mechanisms and a feature pyramid to facilitate the correlation between features of the 2D X-ray and the anatomical pose of the CT. Tests on three different regions of interest acquired from open-source datasets show that our method can achieve high pose estimation accuracy (mean rotation and translation errors of 0.142° and 0.362 mm, respectively) in a short time (0.02 s). Robustness tests indicate that our proposed method maintains zero registration failures across varying levels of noise. This generalizable learning-based 2D (X-ray) to 3D (CT) registration algorithm has promising applications in surgical navigation, targeted radiotherapy, and other clinical operations, with substantial potential for enhancing the accuracy and efficiency of image-guided surgery.
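The abstract does not include code, so the following is only a minimal PyTorch sketch of the general idea it describes: a dual-branch feature extractor whose outputs are fused by cross-attention and regressed to a 6-DoF pose. Plain convolutional stems stand in for the paper's Swin transformer stages and feature pyramid, and all names here (e.g., DualBranchPoseRegressor, the stem/head layouts) are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualBranchPoseRegressor(nn.Module):
    """Illustrative two-branch pose regressor (not the paper's network).

    One branch encodes the intraoperative X-ray; the other encodes a
    projection derived from the preoperative CT (e.g., a digitally
    reconstructed radiograph). Cross-attention correlates the two token
    sets, and a small head regresses the 6-DoF pose
    (x, y, z, pitch, yaw, roll)."""

    def __init__(self, dim: int = 96):
        super().__init__()
        # Simple conv stems as stand-ins for Swin transformer stages.
        self.xray_stem = nn.Sequential(
            nn.Conv2d(1, dim, kernel_size=4, stride=4), nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=2, stride=2), nn.GELU())
        self.ct_proj_stem = nn.Sequential(
            nn.Conv2d(1, dim, kernel_size=4, stride=4), nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=2, stride=2), nn.GELU())
        # Cross-attention: X-ray tokens query the CT-projection tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4,
                                                batch_first=True)
        # Pooled fused tokens -> 6 pose parameters.
        self.head = nn.Sequential(
            nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU(),
            nn.Linear(dim, 6))

    def forward(self, xray: torch.Tensor, ct_proj: torch.Tensor):
        # (B, 1, H, W) -> (B, N, C) token sequences per branch.
        fx = self.xray_stem(xray).flatten(2).transpose(1, 2)
        fc = self.ct_proj_stem(ct_proj).flatten(2).transpose(1, 2)
        fused, _ = self.cross_attn(fx, fc, fc)
        # Mean-pool tokens, then regress (tx, ty, tz, pitch, yaw, roll).
        return self.head(fused.mean(dim=1))

# Toy usage: batch of two 256x256 single-channel images.
model = DualBranchPoseRegressor()
pose = model(torch.randn(2, 1, 256, 256), torch.randn(2, 1, 256, 256))
print(pose.shape)  # torch.Size([2, 6])
```

Regressing the pose directly, as in this sketch, is what makes DL-based registration fast at inference time (a single forward pass) compared with iteratively optimizing an image-similarity metric.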