Eltouny Kareem, Sajedi Seyedomid, Liang Xiao
Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260, USA.
Structural Mechanics & Materials Division, Simpson Gumpertz & Heger, Waltham, MA 02451, USA.
Sensors (Basel). 2024 Sep 17;24(18):6007. doi: 10.3390/s24186007.
Advances in drones and imaging hardware have opened up many possibilities for enhancing structural condition assessments and visual inspections. However, processing the inspection images requires considerable work hours, which delays the assessment process. This study presents a semantic segmentation architecture that integrates vision transformers with Laplacian pyramid scaling networks, enabling rapid and accurate pixel-level damage detection. Unlike conventional methods, which often lose critical details when resampling or cropping high-resolution images, our approach uses non-uniform image rescaling networks to preserve essential inspection-related information such as microcracks and edges. This allows for detailed damage identification in high-resolution images at a significantly reduced computational cost. Our main contributions are: (1) two rescaling networks that together make it possible to process high-resolution images with significantly lower computational demands; and (2) Dmg2Former, a low-resolution segmentation network with a Swin Transformer backbone that leverages the saved computational resources to produce detailed visual inspection masks. We validate our method through a series of experiments on publicly available visual inspection datasets, addressing tasks such as crack detection and material identification. Finally, we examine the computational efficiency of the adaptive rescalers in terms of multiply-accumulate operations and GPU memory requirements.
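For readers unfamiliar with the Laplacian pyramid named in the abstract, the following is a minimal sketch of the classical, non-learned decomposition in NumPy. It is not the authors' rescaling network: the function names, the 2x2 average-pooling downsampler, and the nearest-neighbour upsampler are all assumptions chosen for brevity. It only illustrates the underlying idea that a low-resolution image plus per-level detail residuals retains the fine structure (edges, thin cracks) that plain downsampling discards.

```python
import numpy as np


def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, low_res_base].

    Assumes a 2D array whose height and width are divisible by 2**levels.
    """
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # Detail residual: what is lost when going down one level.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # coarsest low-resolution image
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by adding details back, coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current


# Synthetic stand-in for a grayscale inspection image (hypothetical data).
rng = np.random.default_rng(0)
image = rng.random((64, 64))

pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
```

In this sketch, a downstream network would only need to process the small base image `pyr[-1]`, while the detail levels carry the high-frequency content needed to recover a full-resolution result. How the paper's learned rescalers use or replace these components is not specified in the abstract.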