Jiang Qingxin, Fan Ying, Li Menghan, Fang Sheng, Zhu Weifang, Xiang Dehui, Peng Tao, Chen Xinjian, Xu Xun, Shi Fei
MIPAV Lab, School of Electronic and Information Engineering, Soochow University, Suzhou 215006, China.
Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, China.
Biomed Opt Express. 2024 Oct 2;15(11):6156-6170. doi: 10.1364/BOE.538959. eCollection 2024 Nov 1.
Optical coherence tomography (OCT) has become the leading imaging technique for the diagnosis and treatment planning of retinal diseases. Retinal OCT image segmentation extracts lesions and/or tissue structures to support ophthalmologists' decisions, and multi-class segmentation is commonly needed. Because the target regions often spread widely within the retina, and different categories can have similar intensities and locations, a good segmentation network must combine global modeling capability with the ability to capture fine details. To address the challenge of capturing global and local features simultaneously, we propose HyFormer, an efficient, lightweight, and robust hybrid network architecture. The architecture uses parallel Transformer and convolutional encoders that capture features independently. A multi-scale gated attention block and a group positional embedding block are introduced within the Transformer encoder to enhance feature extraction. Features are integrated in a decoder composed of the proposed three-path fusion modules. A class activation map-based cross-entropy loss function is also proposed to improve segmentation results. Evaluations are performed on a private dataset of myopic traction maculopathy lesions and on the public AROI dataset for retinal layer and lesion segmentation in age-related macular degeneration. The results demonstrate HyFormer's superior segmentation performance and robustness compared with existing methods, showing promise for accurate and efficient OCT image segmentation.