HETMCL: High-Frequency Enhancement Transformer and Multi-Layer Context Learning Network for Remote Sensing Scene Classification.

Authors

Xu Haiyan, Song Yanni, Xu Gang, Wu Ke, Wen Jianguang

Affiliations

Zhejiang College of Security Technology, Wenzhou 325000, China.

Wenzhou Future City Research Institute, Wenzhou 325000, China.

Publication

Sensors (Basel). 2025 Jun 17;25(12):3769. doi: 10.3390/s25123769.

Abstract

Remote Sensing Scene Classification (RSSC) is an important and challenging research topic. Transformer-based methods have shown encouraging performance in capturing global dependencies. However, recent studies have revealed that Transformers perform poorly in capturing high frequencies that mainly convey local information. To solve this problem, we propose a novel method based on High-Frequency Enhanced Vision Transformer and Multi-Layer Context Learning (HETMCL), which can effectively learn the comprehensive features of high-frequency and low-frequency information in visual data. First, Convolutional Neural Networks (CNNs) extract low-level spatial structures, and the Adjacent Layer Feature Fusion Module (AFFM) reduces semantic gaps between layers to enhance spatial context. Second, the High-Frequency Information Enhancement Vision Transformer (HFIE) includes a High-to-Low-Frequency Token Mixer (HLFTM), which captures high-frequency details. Finally, the Multi-Layer Context Alignment Attention (MCAA) integrates multi-layer features and contextual relationships. On UCM, AID, and NWPU datasets, HETMCL achieves state-of-the-art OA of 99.76%, 97.32%, and 95.02%, respectively, outperforming existing methods by up to 0.38%.

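The abstract's core premise is that plain self-attention under-represents high-frequency content (edges, fine texture), so the HLFTM mixes high- and low-frequency components of the feature map. The paper's exact mixer design is not given in the abstract; the sketch below only illustrates the underlying decomposition idea, using average-pool/upsample as a crude low-pass filter and the residual as the high-frequency part. All names here (`split_frequencies`, the pooling factor) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def split_frequencies(feat, pool=2):
    """Split a feature map of shape (H, W, C) into low- and high-frequency parts.

    Low frequency: average-pool by `pool`, then nearest-neighbor upsample
    back to (H, W) -- a crude low-pass filter. High frequency: the residual,
    which carries the edges and fine texture that attention-only models
    tend to under-weight. The decomposition is exact: low + high == feat.
    """
    H, W, C = feat.shape
    assert H % pool == 0 and W % pool == 0, "spatial dims must divide the pool size"
    # Average-pool: group pixels into pool x pool blocks and take the block mean.
    low_small = feat.reshape(H // pool, pool, W // pool, pool, C).mean(axis=(1, 3))
    # Nearest-neighbor upsample back to the original resolution.
    low = np.repeat(np.repeat(low_small, pool, axis=0), pool, axis=1)
    high = feat - low
    return low, high

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
low, high = split_frequencies(x)
# The two branches reconstruct the input exactly.
assert np.allclose(low + high, x)
```

A token mixer in this spirit would then process the two branches differently (e.g. local convolution on `high`, attention on `low`) before re-fusing them, which matches the abstract's description of capturing high-frequency detail alongside global dependencies.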
Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c0a/12196739/a8e90d54d86d/sensors-25-03769-g001.jpg
