Liu Guoxi, Wu Xiaojing, Dai Fei, Liu Guozhi, Li Lecheng, Huang Bi
College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China.
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China.
Sensors (Basel). 2025 Apr 12;25(8):2446. doi: 10.3390/s25082446.
Pavement crack detection is crucial for ensuring road safety and reducing maintenance costs. Existing methods typically use convolutional neural networks (CNNs) to extract multi-level features from pavement images and employ attention mechanisms to enhance global features. However, fusing low-level features introduces substantial interference, which lowers detection accuracy for small-scale cracks with subtle local structures and varying global morphologies. In this paper, we propose Crack-MsCGA, a computationally efficient deep learning network that combines CNNs with multi-scale attention for multi-scale crack detection. The network avoids fusing low-level features to reduce noise interference. To compensate for the resulting loss of detail, we propose a multi-scale attention mechanism (MsCGA) that learns both local detail features and global features from high-level features. Specifically, MsCGA first employs local window attention to learn short-range dependencies, aggregating local features within each window. Second, it applies a cascaded group attention mechanism to learn long-range dependencies, extracting global features across the entire image. Finally, it uses a multi-scale attention fusion strategy based on Mixed Local Channel Attention (MLCA) to selectively fuse the local and global features of pavement cracks. Compared with five existing methods on the DH807 dataset, Crack-MsCGA improves AP@50 over the state-of-the-art methods by 11.3% for small-scale, 8.1% for medium-scale, and 5.9% for large-scale crack detection.
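The three stages described in the abstract (local window attention, cascaded group attention, and a fusion of the two) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. All function names are hypothetical, the query/key/value projections are taken to be identity maps, the cascade simply adds each group's output to the next group's input, and the per-channel sigmoid gate is only a simplified stand-in for MLCA, whose details the abstract does not give.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Scaled dot-product self-attention on (..., N, C) tokens.
    # Assumption: identity Q/K/V projections, to keep the sketch short.
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def window_attention(feat, window):
    # Short-range dependencies: attention restricted to non-overlapping
    # window x window patches of an (H, W, C) feature map.
    H, W, C = feat.shape
    x = feat.reshape(H // window, window, W // window, window, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(H // window, W // window, window * window, C)
    x = self_attention(x)
    x = x.reshape(H // window, W // window, window, window, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def cascaded_group_attention(feat, groups):
    # Long-range dependencies: channels are split into groups, each group
    # attends over all H*W positions, and each group's output is added to
    # the next group's input (the "cascade").
    H, W, C = feat.shape
    tokens = feat.reshape(H * W, C)
    outputs, carry = [], 0.0
    for chunk in np.split(tokens, groups, axis=-1):
        carry = self_attention(chunk + carry)
        outputs.append(carry)
    return np.concatenate(outputs, axis=-1).reshape(H, W, C)

def channel_gated_fusion(local_feat, global_feat):
    # Hypothetical stand-in for MLCA-based fusion: a per-channel sigmoid
    # gate, computed from globally pooled features, mixes the two branches.
    pooled = (local_feat + global_feat).mean(axis=(0, 1))
    gate = 1.0 / (1.0 + np.exp(-pooled))
    return gate * local_feat + (1.0 - gate) * global_feat

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))  # toy high-level feature map
local_feat = window_attention(feat, window=4)
global_feat = cascaded_group_attention(feat, groups=4)
fused = channel_gated_fusion(local_feat, global_feat)
print(fused.shape)  # (8, 8, 16)
```

In this sketch, changing the features in one window leaves the window-attention output in other windows untouched, whereas the cascaded group attention output changes everywhere. That is the short-range versus long-range distinction the abstract draws.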