Key Laboratory of Modern Teaching Technology, Ministry of Education, School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
Comput Biol Med. 2024 Aug;178:108773. doi: 10.1016/j.compbiomed.2024.108773. Epub 2024 Jun 25.
Extracting global and local feature information from retinal blood vessel medical images remains challenging: fuzzy edge features, noise, the difficulty of distinguishing lesion regions from background, and the loss of low-level feature information all lead to insufficient feature extraction. To address these problems and fully extract the global and local feature information of the image, we propose a novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation, consisting of an encoder and a decoder bridged by a transscale transformer cascade module. The encoder comprises a local-global transscale transformer module, a multi-head layered transscale adaptive embedding module, and a local context network (LCNet) module. The transscale transformer cascade module learns local and global feature information, together with multi-scale dependent features, from the first three encoder layers; fuses the hierarchical feature information from the skip-connection block and the channel-token interaction fusion block, respectively; and feeds the result to the decoder. The decoder includes a local context network decoding module and a transscale position transformer module: the local and global feature information extracted by the encoder, with key position information retained, is fed to the decoding module and the position-embedding transformer module to recover and output prediction results consistent with the input feature information. In addition, we propose an improved cross-entropy loss function based on the deviation distance between deterministic observation samples and the prediction results. Validated on the DRIVE and STARE datasets together with the proposed dual-transformer network model, it achieves segmentation accuracies of 97.26% and 97.87%, respectively. Compared with other state-of-the-art networks, the results show that the proposed model has a significant competitive advantage in retinal blood vessel segmentation performance.
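The abstract names the encoder, cascade, and decoder modules but not their internals. As a rough orientation only, below is a minimal, hypothetical PyTorch skeleton of the overall wiring: three encoder stages whose outputs are resampled to a common scale and fused by a cascade block before decoding with skip connections. Every name and design choice here (conv_block, TransscaleCascadeNetSketch, the base width, the fusion scheme) is a placeholder assumption, not the paper's actual module design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    # Placeholder standing in for the paper's transscale transformer
    # and LCNet blocks, whose internals are not given in the abstract.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class TransscaleCascadeNetSketch(nn.Module):
    """Schematic wiring only: encoder stages -> cascade fusion of the
    first three stages -> decoder with skip connections restoring the
    input resolution. Not the paper's implementation."""

    def __init__(self, in_ch=3, base=32):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)         # stage 1 (full resolution)
        self.enc2 = conv_block(base, base * 2)      # stage 2 (1/2 resolution)
        self.enc3 = conv_block(base * 2, base * 4)  # stage 3 (1/4 resolution)
        self.pool = nn.MaxPool2d(2)
        # Cascade module: fuses hierarchical features from the first three stages.
        self.fuse = conv_block(base * 7, base * 4)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec2 = conv_block(base * 6, base * 2)
        self.dec1 = conv_block(base * 3, base)
        self.head = nn.Conv2d(base, 1, 1)

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(self.pool(f1))
        f3 = self.enc3(self.pool(f2))
        # "Transscale" fusion: bring all three scales to the coarsest one.
        size = f3.shape[-2:]
        cascade = torch.cat([
            F.interpolate(f1, size=size, mode="bilinear", align_corners=False),
            F.interpolate(f2, size=size, mode="bilinear", align_corners=False),
            f3,
        ], dim=1)
        g = self.fuse(cascade)
        d2 = self.dec2(torch.cat([self.up(g), f2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up(d2), f1], dim=1))  # skip connection
        return torch.sigmoid(self.head(d1))                  # vessel probability map
```

A dummy forward pass such as `TransscaleCascadeNetSketch()(torch.randn(1, 3, 64, 64))` returns a (1, 1, 64, 64) vessel probability map; input height and width must be divisible by 4 in this sketch.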
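The abstract describes the loss only as an improved cross-entropy based on the deviation distance between deterministic observation samples (ground-truth labels) and the predictions. One plausible reading, sketched below as an assumption, is binary cross-entropy reweighted per pixel by that deviation distance; the function name `deviation_weighted_bce` and the weighting form (1 + |p - y|)^gamma are illustrative choices, not the paper's definition.

```python
import torch

def deviation_weighted_bce(pred, target, gamma=2.0, eps=1e-7):
    """Hypothetical deviation-distance-weighted cross-entropy.

    pred:   sigmoid probabilities, shape (N, 1, H, W)
    target: binary vessel masks,   shape (N, 1, H, W)
    """
    pred = pred.clamp(eps, 1.0 - eps)
    # Per-pixel deviation distance between the prediction and the
    # deterministic observation (ground-truth label), in [0, 1].
    deviation = (pred - target).abs()
    # Up-weight pixels whose prediction strays far from the label;
    # gamma is an illustrative hyperparameter, not from the paper.
    weight = (1.0 + deviation) ** gamma
    bce = -(target * pred.log() + (1.0 - target) * (1.0 - pred).log())
    return (weight * bce).mean()
```

With gamma = 0 this reduces to plain binary cross-entropy, so the deviation term acts purely as a difficulty-dependent reweighting of hard pixels such as fuzzy vessel edges.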