Guo Maozu, Tian Wenbo, Li Yang, Sui Dong
School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616, China.
Beijing Key Laboratory for Intelligent Processing Methods of Architectural Big Data, Beijing University of Civil Engineering and Architecture, Beijing 102616, China.
Sensors (Basel). 2024 May 21;24(11):3268. doi: 10.3390/s24113268.
Structural health monitoring of roads is an important task that supports the inspection of transportation infrastructure. This paper explores deep learning techniques for crack detection in road images and proposes FetNet, an automatic pixel-level semantic segmentation method for road crack images based on the Swin Transformer. FetNet employs Swin-T as the backbone network to extract multi-level feature information from crack images and uses a texture unit to capture the texture and edge characteristics of cracks. A refinement attention module (RAM) and a panoramic feature module (PFM) then merge these diverse features and refine the segmentation results. We collect four public real-world datasets and conduct extensive experiments comparing FetNet with various deep learning methods. On the Crack500 dataset, FetNet achieves the highest precision of 90.4%, a recall of 85.3%, an F1 score of 87.9%, and a mean intersection over union of 78.6%. The experimental results show that FetNet surpasses other advanced models in crack segmentation accuracy and generalizes well to complex scenes.
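For reference, the pixel-level metrics reported above (precision, recall, F1 score, and mean intersection over union) can be computed from predicted and ground-truth crack masks as in the minimal sketch below; the 0.5 binarization threshold and two-class IoU averaging are assumptions for illustration, not necessarily the paper's exact evaluation protocol.

import numpy as np

def crack_segmentation_metrics(pred, target, threshold=0.5, eps=1e-7):
    # pred   : float array of crack probabilities in [0, 1]
    # target : binary ground-truth mask (1 = crack pixel, 0 = background)
    # Binarize predictions at an assumed 0.5 threshold.
    pred_bin = (np.asarray(pred) >= threshold).astype(np.uint8)
    target = np.asarray(target).astype(np.uint8)

    # Pixel-level confusion counts for the crack class.
    tp = np.sum((pred_bin == 1) & (target == 1))
    fp = np.sum((pred_bin == 1) & (target == 0))
    fn = np.sum((pred_bin == 0) & (target == 1))
    tn = np.sum((pred_bin == 0) & (target == 0))

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)

    # Mean IoU averaged over the two classes (crack and background);
    # this averaging choice is an assumption.
    iou_crack = tp / (tp + fp + fn + eps)
    iou_background = tn / (tn + fp + fn + eps)
    miou = (iou_crack + iou_background) / 2

    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}

A typical usage would pass a model's sigmoid output map and the annotated crack mask for each test image and average the resulting metrics over the dataset.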