Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery.

Affiliation

School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China.

Publication information

Sensors (Basel). 2021 Mar 11;21(6):1983. doi: 10.3390/s21061983.

Abstract

Proper interpretation of remote sensing images (RSIs) and precise semantic labeling of their component parts are essential for researchers. Although FCN (Fully Convolutional Network)-like deep convolutional architectures have been widely applied to perception in autonomous cars, two challenges remain in the semantic segmentation of RSIs. The first is to identify details in high-resolution images of complex scenes and to resolve class-mismatch issues; the second is to delineate object edges finely without confusion from the surroundings. HRNET maintains a high-resolution representation by fusing feature information across parallel multi-resolution convolution branches. We adopt HRNET as the backbone and incorporate the Class-Oriented Region Attention Module (CRAM) and the Class-Oriented Context Fusion Module (CCFM) to analyze the relationships between classes and patch regions and between classes and local or global pixels, respectively. This enhances the model's ability to perceive fine details in aerial images. We use these modules to build an end-to-end semantic segmentation model for aerial images and validate it on the ISPRS Potsdam and Vaihingen datasets. The experimental results show that our model improves on the baseline accuracy and outperforms some commonly used CNN architectures.
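The abstract's description of the backbone, parallel branches at different resolutions whose features are fused, can be illustrated with a small NumPy sketch. This is only a simplified illustration under stated assumptions, not the authors' implementation: it assumes every branch has the same channel count, uses nearest-neighbor upsampling and average pooling in place of the learned convolutions a real network would use, and the function names (`upsample_nearest`, `downsample_mean`, `fuse_branches`) are hypothetical.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbor upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def downsample_mean(x, factor):
    """Average-pool a (C, H, W) array by an integer factor.

    Assumes H and W are divisible by factor.
    """
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))


def fuse_branches(branches):
    """Exchange information across parallel multi-resolution branches.

    Assumption: branches[i] has shape (C, H / 2**i, W / 2**i), with the same
    C for every branch. Each output branch is the sum of all input branches
    resampled to its own resolution, so the highest-resolution branch is
    kept at full size while receiving context from the coarser ones.
    """
    fused = []
    for i, target in enumerate(branches):
        total = np.zeros_like(target)
        for j, src in enumerate(branches):
            if j == i:
                total += src                                   # same resolution
            elif j > i:
                total += upsample_nearest(src, 2 ** (j - i))   # coarser -> finer
            else:
                total += downsample_mean(src, 2 ** (i - j))    # finer -> coarser
        fused.append(total)
    return fused
```

Calling `fuse_branches` on three arrays of shapes `(4, 8, 8)`, `(4, 4, 4)`, and `(4, 2, 2)` returns three arrays of the same shapes, each mixing information from all three resolutions. In a trained network the resampling steps would carry learned weights; they are omitted here to keep the data flow visible.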

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1250/8002143/06cf3f89f737/sensors-21-01983-g001.jpg
