IEEE Trans Image Process. 2022;31:823-838. doi: 10.1109/TIP.2021.3135708. Epub 2022 Jan 4.
Multi-modal retinal image registration plays an important role in ophthalmological diagnosis. Conventional methods lack robustness when aligning multi-modal images of varying imaging quality. Deep-learning methods have not been widely developed for this task, especially for the coarse-to-fine registration pipeline. To handle this task, we propose a two-step method based on deep convolutional networks, consisting of a coarse alignment step and a fine alignment step. In the coarse alignment step, a global registration matrix is estimated by three sequentially connected networks for vessel segmentation, feature detection and description, and outlier rejection, respectively. In the fine alignment step, a deformable registration network finds pixel-wise correspondence between a target image and the coarsely aligned image from the previous step to further improve the alignment accuracy. In particular, an unsupervised learning framework is proposed to handle the difficulties of inconsistent modalities and the lack of labeled training data in the fine alignment step. The proposed framework first converts the multi-modal images into the same modality through modality transformers, and then adopts a photometric consistency loss and a smoothness loss to train the deformable registration network. The experimental results show that the proposed method achieves state-of-the-art results on the Dice metric and is more robust in challenging cases.
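The fine alignment objective and the Dice evaluation mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss formulas, so everything specific here is an assumption: the photometric term is taken as a mean absolute intensity difference, the smoothness term as first-order finite differences of the displacement field, and the names `photometric_loss`, `smoothness_loss`, `total_loss`, `dice`, and the weight `smooth_weight` are hypothetical. The modality transformers and the warping step are omitted; the inputs are assumed to be already in the same modality and already warped.

```python
import numpy as np


def photometric_loss(target, warped):
    """Mean absolute intensity difference between two same-modality images.

    Assumption: an L1 penalty; the abstract does not specify the norm.
    """
    return float(np.mean(np.abs(target - warped)))


def smoothness_loss(flow):
    """Mean absolute first-order finite difference of a displacement field.

    flow has shape (H, W, 2) and holds per-pixel (dy, dx) displacements.
    Assumption: first-order differences; other regularizers are possible.
    """
    diff_rows = np.abs(np.diff(flow, axis=0))  # vertically adjacent pixels
    diff_cols = np.abs(np.diff(flow, axis=1))  # horizontally adjacent pixels
    return float(diff_rows.mean() + diff_cols.mean())


def total_loss(target, warped, flow, smooth_weight=0.1):
    """Weighted sum of both terms; smooth_weight is a hypothetical value."""
    return photometric_loss(target, warped) + smooth_weight * smoothness_loss(flow)


def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks (e.g. vessel segmentations)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return float(2.0 * np.logical_and(a, b).sum() / denom)
```

A constant displacement field has zero smoothness loss, so a pure translation is not penalized by the regularizer; only spatial variation in the field is. In a real training setup these functions would be written in a differentiable framework rather than NumPy.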