Multiscale Feature-Learning with a Unified Model for Hyperspectral Image Classification

Authors

Arshad Tahir, Zhang Junping, Ullah Inam, Ghadi Yazeed Yasin, Alfarraj Osama, Gafar Amr

Affiliations

School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China.

Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea.

Publication information

Sensors (Basel). 2023 Sep 3;23(17):7628. doi: 10.3390/s23177628.

DOI:10.3390/s23177628

PMID:37688086

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10490724/

Abstract

In hyperspectral image classification, the pursuit of higher accuracy and more comprehensive feature extraction has motivated more advanced architectures. This study proposes a unified model that combines three distinct branches: a Swin transformer, a convolutional neural network (CNN), and an encoder-decoder. The main objective is to facilitate multiscale feature learning, a pivotal aspect of hyperspectral image classification, with each branch specializing in a different aspect of multiscale feature extraction. The Swin transformer, recognized for its ability to capture long-range dependencies, extracts structural features across different scales; the CNN performs localized feature extraction, preserving fine spatial detail. The encoder-decoder branch performs analysis and reconstruction, integrating multiscale spectral and spatial information. To evaluate the approach, we conducted experiments on publicly available datasets and compared the results with state-of-the-art methods. The proposed model achieved the best classification results among the compared methods, with overall accuracies of 96.87%, 98.48%, and 98.62% on the Xuzhou, Salinas, and LK datasets, respectively.
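The abstract reports its results as overall accuracy (OA). For readers unfamiliar with the metric, the sketch below uses the standard definition: the fraction of labeled test pixels whose predicted class matches the reference class. The function name `overall_accuracy` and the toy label lists are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labeled pixels whose predicted class equals the reference class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one labeled pixel is required")
    # Count positions where the prediction agrees with the reference label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels (hypothetical, not from the paper): 8 test pixels, 3 classes.
reference = [0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 0, 1, 1, 2, 2, 2, 2]
oa = overall_accuracy(reference, predicted)
print(f"OA = {oa:.2%}")  # 7 of 8 pixels match
```

Because OA weights every pixel equally, it can be dominated by large classes. Hyperspectral studies often report it alongside per-class metrics for that reason.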
