Cao Guogang, Wu Yan, Peng Zeyu, Zhou Zhilin, Dai Cuixia
Shanghai Institute of Technology, Shanghai 201418, China.
Biomed Opt Express. 2024 Feb 13;15(3):1605-1617. doi: 10.1364/BOE.510464. eCollection 2024 Mar 1.
The structure of the retinal layers provides valuable diagnostic information for many ophthalmic diseases. Optical coherence tomography (OCT) acquires cross-sectional images of the retina that reveal the retinal layers. U-Net-based approaches are prominent among retinal layer segmentation methods; they capture local characteristics well but are weak at modeling the long-range dependencies that carry contextual information. Furthermore, the morphology of diseased retinal layers is more complex, which makes retinal layer segmentation more challenging. We propose a U-shaped network that combines an encoder-decoder architecture with self-attention mechanisms. To suit the characteristics of retinal OCT cross-sectional images, a self-attention module operating in the vertical direction is added at the bottom of the U-shaped network, and attention mechanisms are also added to the skip connections and up-sampling path to enhance essential features. In this method, the transformer's self-attention mechanism provides a global receptive field and thus the context information that convolutions lack, while the convolutional neural network efficiently extracts local features, compensating for the local details the transformer ignores. Experimental results showed that our method is accurate and outperforms other methods for retinal layer segmentation, with average Dice scores of 0.871 and 0.820, respectively, on two public retinal OCT image datasets. By incorporating the transformer's self-attention mechanism into a U-shaped network, the proposed method improves layer segmentation of retinal OCT images, which is helpful for ophthalmic disease diagnosis.
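The abstract names two ideas that a short example may make more concrete: self-attention restricted to the vertical direction, and the average Dice score used for evaluation. The following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify the module's design, so the function names (`vertical_self_attention`, `mean_dice`), the single-head formulation, the absence of positional encodings, and the projection matrices `w_q`, `w_k`, `w_v` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def vertical_self_attention(feat, w_q, w_k, w_v):
    """Single-head self-attention along the height axis only (assumed design).

    feat: (H, W, C) feature map; w_q, w_k, w_v: (C, D) projection matrices.
    Each image column is treated as a sequence of H tokens, so a position
    attends only to positions above and below it in the same column.
    Returns the (H, W, D) output and the (W, H, H) attention weights.
    """
    cols = np.transpose(feat, (1, 0, 2))            # (W, H, C): one sequence per column
    q, k, v = cols @ w_q, cols @ w_k, cols @ w_v    # each (W, H, D)
    scores = q @ np.transpose(k, (0, 2, 1)) / np.sqrt(q.shape[-1])  # (W, H, H)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    out = weights @ v                               # (W, H, D)
    return np.transpose(out, (1, 0, 2)), weights


def mean_dice(pred, target, num_classes, eps=1e-8):
    """Average Dice over classes, 2|A ∩ B| / (|A| + |B|), for integer label maps."""
    scores = []
    for c in range(num_classes):
        a, b = pred == c, target == c
        denom = a.sum() + b.sum()
        if denom == 0:
            continue  # class absent from both maps: skip it
        scores.append(2.0 * np.logical_and(a, b).sum() / (denom + eps))
    return float(np.mean(scores))
```

Restricting attention to columns reduces the cost from attending over all H×W positions to attending over H positions per column, and it matches the fact that retinal layers are stacked top to bottom in an OCT B-scan; whether the paper's module is motivated in exactly this way is not stated in the abstract.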