School of Medical Information Engineering, Gansu University of Traditional Chinese Medicine, Lanzhou, 730000, People's Republic of China.
Orthopedic Traumatology Hospital, Quanzhou, Fujian, 362000, People's Republic of China.
Biomed Phys Eng Express. 2024 Aug 12;10(5). doi: 10.1088/2057-1976/ad6992.
Objective. Existing registration networks based on cross-attention usually divide the image pair to be registered into patches for input. The repeated splitting and merging of these patches makes it difficult to preserve the topology of the deformation field and reduces the interpretability of the network. Our goal is therefore to develop a new network architecture that combines a cross-attention mechanism with a multi-resolution strategy to improve the accuracy and interpretability of medical image registration. Approach. We propose NCNet, a new deformable image registration network based on neighborhood cross-attention combined with a multi-resolution strategy. The network consists mainly of a multi-resolution feature encoder, a multi-head neighborhood cross-attention module, and a registration decoder. Large-kernel parallel convolution blocks improve the hierarchical feature extraction capability of the encoder; the cross-attention module, which is computed over neighborhoods rather than partitioned patches, reduces the impact on the topology of the deformation field, and double normalization reduces its computational complexity. Main results. We performed atlas-based registration and inter-subject registration on the public 3D brain magnetic resonance imaging datasets LPBA40 and IXI, respectively. Compared with the popular VoxelMorph method, our method improves the average Dice similarity coefficient (DSC) by 7.9% on LPBA40 and 3.6% on IXI. Compared with the popular TransMorph method, it improves the average DSC by 4.9% on LPBA40 and 1.3% on IXI. Significance. We demonstrated the advantages of neighborhood attention over window attention based on partitioned patches, and analyzed the impact of the pyramid feature encoder and double normalization on network performance. This work contributes to the further development of medical image registration methods.
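To make the idea of neighborhood-based cross-attention concrete, the following is a minimal 1D sketch in plain Python; it is not the NCNet implementation. It assumes that queries come from one image's features and keys/values from the other image's features on the same grid, and that each query position attends only to positions within a fixed radius. The function name, the `radius` parameter, and the clipping of the neighborhood at the borders are illustrative assumptions; multi-head projection and the double normalization mentioned in the abstract are not shown.

```python
import math

def neighborhood_cross_attention(queries, keys, values, radius):
    """Hypothetical 1D sketch: query i attends only to keys/values at
    positions j with |i - j| <= radius (neighborhood clipped at borders)."""
    n = len(keys)
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for i, q in enumerate(queries):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Scaled dot-product scores against the local neighborhood only.
        scores = [scale * sum(a * b for a, b in zip(q, keys[j]))
                  for j in range(lo, hi)]
        # Numerically stable softmax over the neighborhood.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted sum of the neighboring value vectors.
        out = [0.0] * len(values[0])
        for w, j in zip(weights, range(lo, hi)):
            for c, v in enumerate(values[j]):
                out[c] += (w / total) * v
        outputs.append(out)
    return outputs
```

Because every position is processed on the original grid, no patch splitting or merging step is needed; the cost per query grows with the neighborhood size rather than with the full sequence length.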
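The reported gains are stated in terms of the Dice similarity coefficient (DSC), which for one anatomical label is twice the overlap of two segmentations divided by the sum of their sizes. A minimal sketch of that per-label computation on flattened label maps follows; the function name and the convention of returning 1.0 when the label is absent from both maps are assumptions made here, not details taken from the paper.

```python
def dice_coefficient(seg_a, seg_b, label):
    """DSC for one label: 2 * |A ∩ B| / (|A| + |B|) on flattened label maps."""
    overlap = sum(1 for x, y in zip(seg_a, seg_b) if x == label and y == label)
    size_a = sum(1 for x in seg_a if x == label)
    size_b = sum(1 for y in seg_b if y == label)
    if size_a + size_b == 0:
        # Assumed convention: a label missing from both maps counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)

# Usage: compare a warped moving segmentation with the fixed segmentation.
warped = [0, 1, 1, 2]
fixed = [0, 1, 2, 2]
per_label = {lab: dice_coefficient(warped, fixed, lab) for lab in (0, 1, 2)}
```

An average DSC such as the one quoted in the abstract would then be a mean of these per-label values over the evaluated structures and test pairs.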