The College of Health Sciences and Engineering, University of Shanghai for Science and Technology, 516 Jungong Highway, Yangpu Area, Shanghai, 200093, China.
The College of Medical Technology, Shanghai University of Medicine & Health Sciences, 279 Zhouzhu Highway, Pudong New Area, Shanghai, 201318, China.
Biomed Eng Online. 2024 Mar 5;23(1):27. doi: 10.1186/s12938-024-01209-z.
The Deep Self-Attention Network (Transformer) is an encoder-decoder model that excels at establishing long-distance dependencies and was first applied in natural language processing. Because its inductive bias complements that of the convolutional neural network (CNN), the Transformer has gradually been applied to medical image processing, including kidney image processing, and has become a hot research topic in recent years. To explore new ideas and directions in renal image processing, this paper outlines the characteristics of the Transformer network model; summarizes the application of Transformer-based models in renal image segmentation, classification, detection, electronic medical records, and decision-making systems; compares them with CNN-based renal image processing algorithms; and analyzes the advantages and disadvantages of the technique in renal image processing. In addition, the paper gives an outlook on the development trend of the Transformer in renal image processing, providing a valuable reference for further renal image analysis.
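The long-distance dependencies mentioned above come from the Transformer's scaled dot-product self-attention, in which every token (or image patch) attends to every other. A minimal NumPy sketch of this core operation is given below; the shapes and toy inputs are purely illustrative, not taken from any model in the survey.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Each output row is a weighted mix of all value rows, so every
    position can draw on every other position in a single step --
    the source of the Transformer's long-range dependency modeling.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n, n) pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted combination of values

# Toy self-attention over 4 "patches" with embedding dimension 8 (Q = K = V)
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8): one attended vector per input patch
```

In a Vision-Transformer-style pipeline for renal images, the rows of `x` would be flattened image patch embeddings rather than word embeddings; the attention computation itself is unchanged.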