SpectralGPT: Spectral Remote Sensing Foundation Model.

Author information

Hong Danfeng, Zhang Bing, Li Xuyang, Li Yuxuan, Li Chenyu, Yao Jing, Yokoya Naoto, Li Hao, Ghamisi Pedram, Jia Xiuping, Plaza Antonio, Gamba Paolo, Benediktsson Jon Atli, Chanussot Jocelyn

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Aug;46(8):5227-5244. doi: 10.1109/TPAMI.2024.3362475. Epub 2024 Jul 2.

Abstract

Foundation models have recently attracted significant attention for their potential to transform visual representation learning in a self-supervised manner. While most foundation models are tailored to RGB images for various visual tasks, there is a noticeable gap in research on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created, for the first time, a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation shows significant performance improvements with pretrained SpectralGPT models across four downstream tasks (single-label scene classification, multi-label scene classification, semantic segmentation, and change detection), indicating substantial potential for advancing spectral RS Big Data applications in geoscience.
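The abstract names 3D token generation for spatial-spectral coupling but gives no implementation details. The sketch below is only an illustration of the general idea, not the authors' method: it cuts a spectral cube into non-overlapping blocks that span a few adjacent bands and a small spatial window, so each token mixes spatial and spectral information. The function name `tokenize_cube`, the `cube[band][row][col]` layout, and the `patch` and `band_group` sizes are all assumptions made for this example.

```python
from itertools import product


def tokenize_cube(cube, patch, band_group):
    """Split a cube indexed as cube[band][row][col] into flat 3D tokens.

    Hypothetical helper: each token covers `band_group` adjacent bands and
    a `patch` x `patch` spatial window. The sizes are illustrative only.
    """
    bands, height, width = len(cube), len(cube[0]), len(cube[0][0])
    if bands % band_group or height % patch or width % patch:
        raise ValueError("cube dimensions must be divisible by the token size")

    tokens = []
    # Walk the cube block by block: band offset, then row offset, then column offset.
    for b0, r0, c0 in product(range(0, bands, band_group),
                              range(0, height, patch),
                              range(0, width, patch)):
        # Flatten one band_group x patch x patch block into a single token.
        token = [cube[b][r][c]
                 for b in range(b0, b0 + band_group)
                 for r in range(r0, r0 + patch)
                 for c in range(c0, c0 + patch)]
        tokens.append(token)
    return tokens


# Toy cube: 4 bands of 4 x 4 pixels; each value encodes (band, row, col).
cube = [[[100 * b + 10 * r + c for c in range(4)] for r in range(4)]
        for b in range(4)]
tokens = tokenize_cube(cube, patch=2, band_group=2)
print(len(tokens), len(tokens[0]))  # prints "8 8"
```

In a real model each flattened token would then be linearly projected to an embedding before entering the transformer; that step is omitted here.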
