Improving building extraction from high-resolution aerial images: Error correction and performance enhancement using deep learning on the Inria dataset.

Author information

Ekiz Serdar, Acar Ugur

Affiliation

Geomatic Engineering Department, Yildiz Technical University, İstanbul, Turkey.

Publication information

Sci Prog. 2025 Jan-Mar;108(1):368504251318202. doi: 10.1177/00368504251318202.

Abstract

Extracting buildings from images is crucial for urban management, urban planning, and post-disaster change detection. Over the years, various approaches have been tried, but the recent application of deep learning has greatly improved the success of such studies. In this study, the Inria dataset was used, consisting of 180 high-resolution aerial images. The study compared the performance of various architectures. DeepLabv3+ emerged as the most successful, with Accuracy, IoU, and F1 Scores of 96.77%, 89.85%, and 94.53%, respectively. Attention U-Net followed, scoring 95.31%, 85.49%, and 91.95%. U-Net, tested with different encoders, achieved average results of 97.22%, 84.78%, and 90.79%. SE-ResNeXt-50 was the best-performing encoder, followed by SE-ResNet-50, ResNeXt-50, and ResNet-50. UNet++ achieved 94.48% Accuracy, 83.09% IoU, and 90.45% F1 Score, while U2Net obtained 94.09%, 82.26%, and 89.88%, making them less successful. When examining the models under challenging conditions, SE-ResNeXt-50 was the most robust, successfully handling scenarios like occlusion by trees and complex indoor gardens. Conversely, Attention U-Net and UNet++ were more prone to errors, particularly when vehicles were parked near buildings or in the presence of shipping containers, where false positives were common. ResNet-50 struggled with concrete gardens, while U2Net showed better results in scenarios involving indoor gardens. These results, compared to other studies using the same dataset with different pixel sizes, show that eliminating erroneous data and resizing images can enhance the performance of deep learning networks. Therefore, by refining the data and adjusting the image sizes, models can make more accurate and efficient building detections.
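For readers unfamiliar with the three metrics quoted above, the following is a minimal sketch of how pixel-wise Accuracy, IoU, and F1 are commonly computed for a binary building mask. It is not the authors' evaluation code: the function name `segmentation_metrics`, the nested-list mask format, and the toy masks are assumptions made for illustration only, and the paper may aggregate scores differently (for example, per image rather than over all pixels).

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[Sequence[int]],
                         truth: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Pixel-wise accuracy, IoU, and F1 for binary masks (1 = building).

    Hypothetical helper for illustration; masks are nested lists of 0/1.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted, building present
            elif p and not t:
                fp += 1          # building predicted, none present
            elif t:
                fn += 1          # building missed
            else:
                tn += 1          # background correctly left empty
    total = tp + fp + fn + tn
    union = tp + fp + fn         # pixels that are building in either mask
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Convention assumed here: two empty masks count as a perfect match.
        "iou": tp / union if union else 1.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
    }


# Toy 2x4 masks (assumed data, not taken from the Inria dataset).
pred = [[1, 1, 0, 0],
        [1, 0, 0, 0]]
truth = [[1, 1, 1, 0],
         [0, 0, 0, 0]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

Accuracy counts true negatives, so it tends to be high on imagery dominated by background, whereas IoU and F1 ignore true negatives and are therefore more sensitive to how well the building pixels themselves are recovered.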

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/688a/11822834/997bba51182f/10.1177_00368504251318202-fig1.jpg
