Wang Cong, Gan Meng
Jiangsu Key Laboratory of Medical Optics, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, China.
Biomed Opt Express. 2021 Apr 7;12(5):2631-2646. doi: 10.1364/BOE.419809. eCollection 2021 May 1.
Automatic segmentation of layered tissue is the key to esophageal optical coherence tomography (OCT) image processing. With the advent of deep learning techniques, frameworks based on fully convolutional networks have proved effective at classifying image pixels. However, because of speckle noise and unfavorable imaging conditions, the esophageal tissue relevant to diagnosis is not always easy to identify. An effective way to address this problem is to extract more powerful feature maps, in which pixels from the same tissue have similar representations and pixels from different tissues are clearly discriminable. In this study, we propose a novel framework, called the tissue self-attention network (TSA-Net), which introduces the self-attention mechanism into esophageal OCT image segmentation. The self-attention module in the network captures long-range context dependencies and analyzes the input image from a global view, which helps to cluster pixels belonging to the same tissue and to reveal the differences between layers, thus yielding more powerful feature maps for segmentation. Experiments visually illustrate the effectiveness of the self-attention map, and the advantages of the proposed network over other deep networks are also discussed.
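The abstract does not specify how the self-attention module is built, so the following is only a minimal sketch of generic spatial self-attention over a feature map, not the TSA-Net module itself. The function name `spatial_self_attention`, the projection matrices, the feature-map size, and the scaled dot-product form are all assumptions made for illustration. It shows the idea described above: every pixel's output is a weighted sum over all pixels in the image, so pixels with similar features can reinforce each other regardless of distance.

```python
import numpy as np

def spatial_self_attention(features, w_query, w_key, w_value):
    """Generic self-attention over the pixels of a feature map (illustrative only).

    features: array of shape (C, H, W)
    w_query, w_key: projection matrices of shape (C, d)   (hypothetical parameters)
    w_value: projection matrix of shape (C, C_out)        (hypothetical parameter)
    Returns the attended feature map (C_out, H, W) and the attention map (H*W, H*W).
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w).T              # one row per pixel: (N, C)

    q = x @ w_query                               # queries (N, d)
    k = x @ w_key                                 # keys    (N, d)
    v = x @ w_value                               # values  (N, C_out)

    # Pairwise similarity between every pair of pixels (global context).
    scores = q @ k.T / np.sqrt(q.shape[1])        # assumed scaled dot-product form
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability for softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # each row sums to 1

    out = attn @ v                                # weighted sum over all pixels
    return out.T.reshape(-1, h, w), attn

# Toy usage with random data standing in for a real OCT feature map.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4, 6))                # C=8 channels, 4x6 pixels
wq = rng.normal(size=(8, 4))
wk = rng.normal(size=(8, 4))
wv = rng.normal(size=(8, 8))
out, attn = spatial_self_attention(feats, wq, wk, wv)
```

Row `i` of `attn` can be reshaped to `(H, W)` and viewed as an image showing which pixels pixel `i` attends to, which is one way an attention map of the kind mentioned in the abstract could be visualized.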