
Dmg2Former-AR: Vision Transformers with Adaptive Rescaling for High-Resolution Structural Visual Inspection.

Authors

Eltouny Kareem, Sajedi Seyedomid, Liang Xiao

Affiliations

Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260, USA.

Structural Mechanics & Materials Division, Simpson Gumpertz & Heger, Waltham, MA 02451, USA.

Publication information

Sensors (Basel). 2024 Sep 17;24(18):6007. doi: 10.3390/s24186007.

DOI: 10.3390/s24186007
PMID: 39338752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11436102/

Abstract

Developments in drones and imaging hardware technology have opened up countless possibilities for enhancing structural condition assessments and visual inspections. However, processing the inspection images requires considerable work hours, leading to delays in the assessment process. This study presents a semantic segmentation architecture that integrates vision transformers with Laplacian pyramid scaling networks, enabling rapid and accurate pixel-level damage detection. Unlike conventional methods that often lose critical details through resampling or cropping high-resolution images, our approach preserves essential inspection-related information such as microcracks and edges using non-uniform image rescaling networks. This innovation allows for detailed damage identification of high-resolution images while significantly reducing the computational demands. Our main contributions in this study are: (1) proposing two rescaling networks that together allow for processing high-resolution images while significantly reducing the computational demands; and (2) proposing Dmg2Former, a low-resolution segmentation network with a Swin Transformer backbone that leverages the saved computational resources to produce detailed visual inspection masks. We validate our method through a series of experiments on publicly available visual inspection datasets, addressing various tasks such as crack detection and material identification. Finally, we examine the computational efficiency of the adaptive rescalers in terms of multiply-accumulate operations and GPU-memory requirements.
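
The abstract refers to Laplacian pyramid scaling networks. For readers unfamiliar with the underlying idea, the following is a minimal sketch of a classical (non-learned) Laplacian pyramid in Python with NumPy. It is not the authors' rescaling network: the function names, the 2x2 average-pooling downsampler, and the nearest-neighbour upsampler are all assumptions chosen for illustration. It shows only how an image can be split into a small coarse image plus per-level detail residuals that retain fine structure such as edges.

```python
import numpy as np

def downsample(img):
    """Halve height and width by averaging each 2x2 block (assumes even dimensions)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse] for a 2D array."""
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        coarse = downsample(current)
        # Detail residual: what is lost when going to the coarser level.
        pyramid.append(current - upsample(coarse))
        current = coarse
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert build_laplacian_pyramid by adding details back, coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current

# Hypothetical input: a random 64x64 grayscale "image".
rng = np.random.default_rng(0)
image = rng.random((64, 64))
pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
```

In this sketch the coarse level is 8x8 while the detail levels are 64x64, 32x32, and 16x16, and the reconstruction recovers the input up to floating-point error. A learned rescaler such as the one described in the abstract would presumably replace the fixed down- and up-sampling operators with trainable ones.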


