CViT Weakly Supervised Network Fusing Dual-Branch Local-Global Features for Hyperspectral Image Classification

Authors

Fu Wentao, Sun Xiyan, Zhang Xiuhua, Ji Yuanfa, Zhang Jiayuan

Affiliations

School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China.

Beidou Navigation Technology Center, Guangxi Institute of Industrial Technology for Space-Time Information, Nanning 530201, China.

Publication information

Entropy (Basel). 2025 Aug 15;27(8):869. doi: 10.3390/e27080869.

DOI:10.3390/e27080869

PMID:40870341

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12385884/

Abstract

In hyperspectral image (HSI) classification, feature learning and label accuracy play a crucial role. In actual hyperspectral scenes, however, noisy labels are unavoidable and seriously impact the performance of methods. While deep learning has achieved remarkable results in HSI classification tasks, its noise-resistant performance usually comes at the cost of feature representation capabilities. High-dimensional and deep convolution can capture rich deep semantic features, but with high complexity and resource consumption. To deal with these problems, we propose a CViT Weakly Supervised Network (CWSN) for HSI classification. Specifically, a lightweight 1D-2D two-branch network is used for local generalization and enhancement of spatial-spectral features. Then, the fusion and characterization of local and global features are achieved through the CNN-Vision Transformer (CViT) cascade strategy. The experimental results on four benchmark HSI datasets show that CWSN has good anti-noise ability and ensures the robustness and versatility of the network facing both clean and noisy training sets. Compared to other methods, the CWSN has better classification accuracy.
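The abstract refers to evaluation on both clean and noisy training sets but does not state how the label noise is produced. As an illustration only, the sketch below assumes symmetric (uniform) label noise, a common convention in noisy-label studies: each training label is replaced, with a fixed probability, by a different class drawn uniformly at random. The function name `add_symmetric_label_noise` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def add_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Return a copy of `labels` with symmetric label noise applied.

    Assumption (not from the paper): each label is flipped with probability
    `noise_rate` to a different class chosen uniformly at random.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must lie in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")

    rng = random.Random(seed)  # local generator, so results are reproducible
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Pick any class except the true one.
            candidates = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(y)
    return noisy


if __name__ == "__main__":
    clean = [i % 4 for i in range(1000)]  # toy labels for 4 classes
    noisy = add_symmetric_label_noise(clean, num_classes=4, noise_rate=0.3)
    flipped = sum(a != b for a, b in zip(clean, noisy))
    print(f"{flipped} of {len(clean)} labels flipped")
```

Training the same classifier once on `clean` and once on `noisy` labels, then comparing accuracy on an untouched test set, is one simple way to probe the kind of noise robustness the abstract describes.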
