Mayya Ali Mahmoud, Alkayem Nizar Faisal
Computer and Automatic Control Engineering Department, Faculty of Mechanical and Electrical Engineering, Tishreen University, Lattakia 2230, Syria.
College of Automation and College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210046, China.
Sensors (Basel). 2024 Dec 19;24(24):8095. doi: 10.3390/s24248095.
Early identification and multi-class detection of concrete cracks can help avoid future deformation or collapse in concrete structures. Traditional detection methodologies require enormous effort and time. Vision-based deep learning models can overcome these difficulties by effectively detecting and classifying various concrete cracks. This study introduces a novel multi-stage deep learning framework for crack detection and type classification. First, the recently developed YOLOV10 model is trained to detect possible defective regions in concrete images. Then, a modified vision transformer (ViT) model is trained to classify concrete images into three main types: normal, simple cracks, and multi-branched cracks. In the evaluation process, concrete test images are fed into the trained YOLOV10 model, which identifies the possible defect regions; the detected regions are then passed to the trained ViT model, which decides the appropriate crack type for each region. Experiments are conducted using both the individual ViT model and the proposed multi-stage framework. To improve the generalization ability, multi-source datasets of concrete structures are used. For the classification part, a concrete crack dataset consisting of 12,000 images in three classes is used, while for the detection part, a dataset composed of various materials from historical buildings, containing 1116 concrete images with their corresponding bounding boxes, is used. The results show that the proposed multi-stage model classifies crack types with 90.67% precision, 90.03% recall, and 90.34% F1-score. The results also show that the proposed model outperforms the individual classification model by 10.9%, 19.99%, and 19.2% for precision, recall, and F1-score, respectively.
The proposed multi-stage YOLOV10-ViT model can be integrated into construction systems that are based on crack materials to provide early warning of possible future deformation in concrete structures.
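The detect-then-classify flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `classify_regions`, `toy_detector`, and `toy_classifier` are hypothetical, and the two toy functions are trivial placeholders standing in for the trained YOLOV10 detector and the modified ViT classifier. Returning an empty list when nothing is detected is also an assumption, since the abstract does not say how such images are handled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), upper bounds exclusive

# The three classes named in the abstract.
CLASSES = ("normal", "simple crack", "multi-branched crack")


@dataclass
class Finding:
    box: Box    # region proposed by the detection stage
    label: str  # crack type assigned by the classification stage


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def classify_regions(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Finding]:
    """Stage 1 proposes regions; stage 2 labels each cropped region."""
    return [Finding(box, classify(crop(image, box))) for box in detect(image)]


def toy_detector(image: Image) -> List[Box]:
    """Placeholder detector: one box around all dark pixels (value < 50)."""
    dark = [(x, y) for y, row in enumerate(image)
            for x, value in enumerate(row) if value < 50]
    if not dark:
        return []
    xs = [x for x, _ in dark]
    ys = [y for _, y in dark]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classifier(region: Image) -> str:
    """Placeholder classifier: any dark pixel counts as a simple crack."""
    has_dark = any(value < 50 for row in region for value in row)
    return CLASSES[1] if has_dark else CLASSES[0]


image = [
    [200, 200, 200, 200],
    [200,  10, 200, 200],
    [200, 200,  10, 200],
    [200, 200, 200, 200],
]
findings = classify_regions(image, toy_detector, toy_classifier)
for finding in findings:
    print(finding.box, finding.label)
```

Passing the two stages in as callables keeps the pipeline logic independent of any particular model library, so real detector and classifier wrappers could be substituted for the placeholders without changing `classify_regions`.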