Yadav Dhirendra Prasad, Sharma Bhisham, Chauhan Shivank, Dhaou Imed Ben
Department of Computer Engineering & Applications, G.L.A. University, Mathura 281406, India.
Centre of Research Impact and Outcome, Chitkara University, Rajpura 140401, Punjab, India.
Sensors (Basel). 2024 Jun 30;24(13):4257. doi: 10.3390/s24134257.
Detecting cracks in building structures is essential for ensuring safety, extending service life, and preserving the economic value of the built environment. Machine learning (ML) and deep learning (DL) techniques have previously been used to improve crack classification accuracy. However, conventional convolutional neural network (CNN) methods incur high computational costs owing to their large number of trainable parameters, and they tend to extract only high-dimensional shallow features that may not comprehensively represent crack characteristics. To address these issues, we propose a novel convolution and composite attention transformer network (CCTNet). CCTNet enhances crack identification by processing more input pixels and by combining convolutional channel attention with window-based self-attention. This dual approach aims to combine the localized feature extraction of CNNs with the global contextual understanding afforded by self-attention. Additionally, we apply an improved cross-attention module within CCTNet to increase the interaction and integration of features across adjacent windows. CCTNet achieves a precision of 98.60%, 98.93%, and 99.33% on the Historical Building Crack2019, SDTNET2018, and proposed DS3 datasets, respectively. Furthermore, the training and validation losses of the proposed model are close to zero. In addition, the AUC (area under the curve) is 0.99 and 0.98 for the Historical Building Crack2019 and SDTNET2018 datasets, respectively. CCTNet outperforms existing methodologies, offering accurate, efficient, and reliable detection of cracks in building structures.
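The combination of convolutional channel attention with window-based self-attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer details, so the squeeze-and-excitation-style channel gate, the single attention head, the residual fusion, and every name here (`channel_attention`, `window_self_attention`, `composite_block`, `alpha`, the weight matrices) are assumptions chosen for illustration. The cross-attention between adjacent windows is not sketched.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat, w1, w2):
    # Assumed squeeze-and-excitation-style gate on a (H, W, C) feature map:
    # global average pool -> small MLP -> sigmoid weight per channel.
    squeezed = feat.mean(axis=(0, 1))             # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)       # ReLU, (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid, (C,)
    return feat * gate                            # broadcast over H and W

def window_partition(feat, ws):
    # Split (H, W, C) into non-overlapping ws x ws windows of tokens.
    H, W, C = feat.shape
    if H % ws or W % ws:
        raise ValueError("H and W must be divisible by the window size")
    x = feat.reshape(H // ws, ws, W // ws, ws, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, ws * ws, C)              # (num_windows, ws*ws, C)

def window_reverse(windows, ws, H, W):
    # Inverse of window_partition.
    C = windows.shape[-1]
    x = windows.reshape(H // ws, W // ws, ws, ws, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def window_self_attention(feat, ws, wq, wk, wv):
    # Single-head scaled dot-product attention computed inside each window.
    H, W, C = feat.shape
    tokens = window_partition(feat, ws)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return window_reverse(softmax(scores) @ v, ws, H, W)

def composite_block(feat, ws, params, alpha=0.5):
    # Assumed fusion: residual input + windowed attention + scaled channel branch.
    local = channel_attention(feat, params["w1"], params["w2"])
    windowed = window_self_attention(
        feat, ws, params["wq"], params["wk"], params["wv"]
    )
    return feat + windowed + alpha * local
```

With an 8 x 8 feature map of 4 channels and a window size of 4, `composite_block` returns a map of the same shape. The scalar `alpha` is a hypothetical knob for weighting the channel-attention branch against the windowed-attention branch.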