Lab for Bone Metabolism, Xi'an Key Laboratory of Special Medicine and Health Engineering, Key Lab for Space Biosciences and Biotechnology, Research Center for Special Medicine and Health Systems Engineering, NPU-UAB Joint Laboratory for Bone Metabolism, School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China.
Comput Biol Med. 2023 Oct;165:107460. doi: 10.1016/j.compbiomed.2023.107460. Epub 2023 Sep 9.
Convolutional neural networks (CNNs) and Transformers play an important role in computer-aided diagnosis and intelligent medicine. However, CNNs cannot capture long-range dependencies, and Transformers suffer from high computational complexity and a large number of parameters. Recently, Multi-Layer Perceptron (MLP)-based medical image processing networks have been shown to achieve higher accuracy than CNNs and Transformers with lower computational cost and fewer parameters. Hence, in this work, we propose U-MLP, an encoder-decoder network based on the ReMLP block. The ReMLP block contains an overlapping sliding window mechanism and a Multi-head Gate Self-Attention (MGSA) module. The overlapping sliding window extracts local image features in a convolution-like manner, and MGSA then fuses the information extracted from multiple dimensions to obtain more contextual semantic information. Meanwhile, to improve the generalization ability of the model, we design the Vague Region Refinement (VRRE) module, which uses the primary features generated by network inference to create local reference features and then determines each pixel's class by inferring the proximity between local features and labeled features. Extensive experimental evaluation shows that U-MLP improves segmentation performance. For skin lesion, spleen, and left atrium segmentation on three benchmark datasets, U-MLP achieved Dice similarity coefficients of 88.27%, 97.61%, and 95.91% on the test sets, respectively, outperforming 7 state-of-the-art methods.
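For readers unfamiliar with the reported metric, the Dice similarity coefficient between a predicted mask P and a reference mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch of that standard definition for binary masks, not the authors' evaluation code. The function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; the abstract does not state which variant or averaging scheme was used.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape.

    Masks are nested lists (rows of 0/1 values). Hypothetical helper for
    illustration only; not taken from the U-MLP paper.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels that are foreground in both masks
    pred_sum = 0      # foreground pixels in the prediction
    target_sum = 0    # foreground pixels in the reference
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                intersection += 1
            if p:
                pred_sum += 1
            if t:
                target_sum += 1

    if pred_sum + target_sum == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy 3x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 0]]
print(f"{dice_coefficient(pred, target):.4f}")  # prints 0.6667
```

A score such as 88.27% corresponds to a value of 0.8827 on this scale; in practice the coefficient is typically averaged over all test images or volumes.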