De Silva Tharindu, Chew Emily Y, Hotaling Nathan, Cukras Catherine A
National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA.
National Center for Advancing Translational Science, National Institutes of Health, Bethesda, MD 20892, USA.
Biomed Opt Express. 2020 Dec 23;12(1):619-636. doi: 10.1364/BOE.408573. eCollection 2021 Jan 1.
This work reports a deep learning-based registration algorithm that aligns multi-modal retinal images collected in longitudinal clinical studies with the accuracy and robustness required for analysis of structural changes in large-scale clinical data. Deep-learning networks mirroring the architecture of conventional feature-point-based registration were evaluated, with variants that solved for affine registration parameters, image patch displacements, and patch displacements within the region of overlap. Ground truth for the deep learning-based approaches was derived from successful conventional feature-based registrations. Cross-sectional and longitudinal affine registrations were performed across color fundus photography (CFP), fundus autofluorescence (FAF), and infrared reflectance (IR) image modalities. For mono-modality longitudinal registration, the conventional feature-based method achieved mean errors in the range of 39-53 µm (depending on the modality), whereas the deep learning method with region-overlap prediction exhibited mean errors in the range of 54-59 µm. For cross-sectional multi-modality registration, the conventional method exhibited gross failures, with large errors in more than 50% of cases, while the proposed deep learning method achieved robust performance with no gross failures and mean errors in the range of 66-69 µm. The deep learning-based method thus achieved superior overall performance across all modalities. The accuracy and robustness reported in this work provide important advances that will facilitate clinical research and enable detailed study of the progression of retinal diseases such as age-related macular degeneration.
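The patch-displacement formulation summarized above can be illustrated with a minimal sketch: if a network predicts a displacement vector for each image patch, a global affine transform can be recovered from those predictions by linear least squares. The function and data below are purely illustrative assumptions, not the authors' implementation; displacements are simulated from a known transform rather than predicted by a network.

```python
import numpy as np

def fit_affine_from_displacements(centers, displacements):
    """Recover a 2x3 affine transform (moving -> fixed coordinates)
    by least squares from patch centers and their displacements."""
    targets = centers + displacements           # (N, 2) matched point locations
    ones = np.ones((centers.shape[0], 1))
    A = np.hstack([centers, ones])              # (N, 3) design matrix [x, y, 1]
    # Solve A @ M.T ~= targets for the 2x3 affine matrix M
    M, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return M.T                                  # shape (2, 3)

# Illustrative check: simulate displacements from a known affine transform
rng = np.random.default_rng(0)
centers = rng.uniform(0, 768, size=(16, 2))     # hypothetical patch centers (px)
true_M = np.array([[1.01, 0.02, 5.0],
                   [-0.02, 0.99, -3.0]])        # small scale/rotation + shift
targets = centers @ true_M[:, :2].T + true_M[:, 2]
est_M = fit_affine_from_displacements(centers, targets - centers)
assert np.allclose(est_M, true_M, atol=1e-6)
```

A least-squares fit of this kind also makes it easy to discard patches outside the region of overlap before solving, which is the motivation for the overlap-prediction variant described in the abstract.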