Efficient attention-based deep encoder and decoder for automatic crack segmentation.

Author information

Kang Dong H, Cha Young-Jin

Affiliations

University of Manitoba, Winnipeg, MB, Canada.

Publication information

Struct Health Monit. 2022 Sep;21(5):2190-2205. doi: 10.1177/14759217211053776. Epub 2021 Dec 19.

Abstract

Recently, crack segmentation has been studied using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground-truth data, the consideration of complex scenes, the development of object-specific networks for crack segmentation, and the choice of evaluation methods, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for real-time, pixel-level crack segmentation in complex scenes. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable Swish activation function, yielding a concise network design that maintains fast processing speed. A method for evaluating the complexity level of image scenes is also proposed. The network is trained on 1203 images with extensive synthesis-based augmentation and evaluated on 545 test images (1280 × 720, 1024 × 512); it achieves 91.7% precision, 92.7% recall, 92.2% F1 score, and 92.6% mIoU (mean intersection over union). Its performance is compared with that of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++); STRNet shows the best performance on the evaluation metrics and the fastest processing speed, at 49.2 frames per second.
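The abstract names two less common components, a focal-Tversky loss and a learnable Swish activation. The sketch below is a minimal PyTorch rendering of both, assuming the standard formulations from the literature; the class names and hyperparameter values (alpha, beta, gamma, the initial Swish beta) are common defaults chosen for illustration, not the settings used in STRNet.

```python
import torch
import torch.nn as nn


class LearnableSwish(nn.Module):
    """Swish activation x * sigmoid(beta * x) with a trainable beta.

    The initial value of beta is an assumption; the abstract only
    states that the activation's parameter is learnable.
    """

    def __init__(self, beta_init: float = 1.0):
        super().__init__()
        self.beta = nn.Parameter(torch.tensor(beta_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(self.beta * x)


class FocalTverskyLoss(nn.Module):
    """Focal-Tversky loss for binary (crack vs. background) segmentation.

    alpha weights false negatives and beta weights false positives in
    the Tversky index; with alpha > beta, missed crack pixels are
    penalized more heavily. The focal exponent gamma emphasizes hard,
    poorly segmented samples. Values here are common defaults, not
    necessarily STRNet's.
    """

    def __init__(self, alpha: float = 0.7, beta: float = 0.3,
                 gamma: float = 0.75, eps: float = 1e-6):
        super().__init__()
        self.alpha, self.beta = alpha, beta
        self.gamma, self.eps = gamma, eps

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Flatten each sample to a vector of per-pixel probabilities.
        probs = torch.sigmoid(logits).flatten(1)
        targets = targets.flatten(1).float()
        tp = (probs * targets).sum(dim=1)           # soft true positives
        fn = ((1.0 - probs) * targets).sum(dim=1)   # soft false negatives
        fp = (probs * (1.0 - targets)).sum(dim=1)   # soft false positives
        tversky = (tp + self.eps) / (tp + self.alpha * fn + self.beta * fp + self.eps)
        return ((1.0 - tversky) ** self.gamma).mean()
```

Usage follows any PyTorch loss, e.g. `loss = FocalTverskyLoss()(model_logits, masks)` with logits and binary masks of shape (N, 1, H, W). Weighting false negatives more heavily and focusing on hard pixels suits thin structures such as cracks, where the foreground class is heavily under-represented.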

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7919/9411784/030fac339123/10.1177_14759217211053776-fig1.jpg
