Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model.

Authors

Zheng Zhuo, Ermon Stefano, Kim Dongjun, Zhang Liangpei, Zhong Yanfei

Publication

IEEE Trans Pattern Anal Mach Intell. 2025 Feb;47(2):725-741. doi: 10.1109/TPAMI.2024.3475824. Epub 2025 Jan 9.

Abstract

Our understanding of the temporal dynamics of the Earth's surface has been significantly advanced by deep vision models, which often require a massive amount of labeled multi-temporal images for training. However, collecting, preprocessing, and annotating multi-temporal remote sensing images at scale is non-trivial, since it is expensive and knowledge-intensive. In this paper, we present scalable multi-temporal change data generators based on generative models, which are cheap and automatic, alleviating these data problems. Our main idea is to simulate a stochastic change process over time. We describe the stochastic change process as a probabilistic graphical model, namely the generative probabilistic change model (GPCM), which factorizes the complex simulation problem into two more tractable sub-problems, i.e., condition-level change event simulation and image-level semantic change synthesis. To solve these two problems, we present Changen2, a GPCM implemented with a resolution-scalable diffusion transformer that can generate time series of remote sensing images, together with the corresponding semantic and change labels, from labeled and even unlabeled single-temporal images. Changen2 is a "generative change foundation model" that can be trained at scale via self-supervision, and is capable of producing change supervisory signals from unlabeled single-temporal images. Unlike existing "foundation models", our generative change foundation model synthesizes change data to train task-specific foundation models for change detection. The resulting models possess inherent zero-shot change detection capabilities and excellent transferability. Comprehensive experiments suggest that Changen2 has superior spatiotemporal scalability in data generation; e.g., a Changen2 model trained on 256-pixel single-temporal images can yield time series of arbitrary length at resolutions of 1,024 pixels. Changen2 pre-trained models exhibit superior zero-shot performance (narrowing the performance gap to 3% on LEVIR-CD and to approximately 10% on both S2Looking and SECOND, compared to fully supervised counterparts) and transferability across multiple types of change tasks, including ordinary and off-nadir building change, land-use/land-cover change, and disaster assessment.
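
One plausible reading of the GPCM factorization described in the abstract, writing x_t for the image and s_t for its semantic mask at epoch t (this notation is assumed here, not drawn verbatim from the paper), is:

    p(x_{t+1}, s_{t+1} \mid x_t, s_t)
        = \underbrace{p(s_{t+1} \mid s_t)}_{\text{condition-level change event simulation}}
          \cdot
          \underbrace{p(x_{t+1} \mid s_{t+1}, x_t)}_{\text{image-level semantic change synthesis}}

Sampling the first factor edits the label mask (e.g., adding or removing building instances); sampling the second renders an image consistent with the edited mask, so every generated image pair arrives with its semantic and change labels for free.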
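As a minimal sketch of how such a two-stage generator could be rolled forward over time (all names below are illustrative and this is not the paper's implementation; the `generator` argument stands in for a conditional model such as Changen2's diffusion transformer):

    import numpy as np

    def simulate_change_events(mask, p_remove=0.3, rng=None):
        # Condition-level step (hypothetical): delete some object instances
        # from the semantic mask and stamp in a few new rectangular ones.
        rng = rng or np.random.default_rng()
        nxt = mask.copy()
        for inst_id in np.unique(mask)[1:]:           # label 0 = background
            if rng.random() < p_remove:
                nxt[nxt == inst_id] = 0               # instance "demolished"
        h, w = mask.shape
        for _ in range(int(rng.integers(0, 4))):      # instances "constructed"
            y, x = int(rng.integers(0, h - 16)), int(rng.integers(0, w - 16))
            nxt[y:y + 16, x:x + 16] = nxt.max() + 1
        return nxt

    def roll_out(image, mask, generator, steps=3, rng=None):
        # Alternate the two GPCM factors to grow a labeled time series.
        images, masks = [image], [mask]
        for _ in range(steps):
            masks.append(simulate_change_events(masks[-1], rng=rng))
            images.append(generator(condition=masks[-1], reference=images[-1]))
        return images, masks

Each step yields an image plus its semantic mask, so the change label between any two epochs can be obtained by differencing the masks rather than by manual annotation.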

