Tang Yizhuo, Cao Zhengtao, Guo Ningbo, Jiang Mingyong
Space Engineering University, Beijing, China.
Sci Rep. 2024 Feb 25;14(1):4577. doi: 10.1038/s41598-024-54096-8.
Change detection in remote sensing image processing is both difficult and important. It is widely used in areas such as land resource planning, monitoring and forecasting of agricultural plant health, and monitoring and assessment of natural disasters. Remote sensing images provide large volumes of long-term, full-coverage data for Earth environmental monitoring. The rapid development of deep learning has driven considerable progress, but most current deep learning-based change detection methods rely on convolutional neural networks (CNNs). Because convolution is a local operation, CNNs cannot capture the interactions between global and long-range semantic information. Several studies have employed the Vision Transformer as a backbone in the remote sensing field. Inspired by this work, we propose Siam-Swin-Unet, a pure-Transformer Siamese network with a U-shaped structure for remote sensing image change detection. Swin Transformer is a hierarchical vision transformer with shifted windows that can extract global features. To learn both local and global semantic feature information, the bi-temporal images are fed into Siam-Swin-Unet, which is composed of Swin Transformer, Unet, Siamesenet and two feature fusion modules. Because the Unet and Siamesenet structures are effective for change detection, we incorporate both into the model. The feature fusion modules fuse the features of the bi-temporal images, and our experiments confirm that they are efficient and computationally inexpensive. Our network achieves an F1 score of 94.67 on the CDD (season-varying) dataset.
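The abstract reports its headline result as an F1 score but does not say how the score is computed. As a minimal sketch, assuming the standard definition of F1 for the "changed" class over binary change masks (the function name `change_f1` and the toy masks are illustrative and not taken from the paper):

```python
from typing import Sequence


def change_f1(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """F1 score of the 'changed' class (label 1) for two same-sized binary masks."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1  # changed pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1  # unchanged pixel flagged as changed
            elif p == 0 and t == 1:
                fn += 1  # changed pixel missed
    denom = 2 * tp + fp + fn
    # Assumption: if neither mask contains a changed pixel, count it as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


# Toy 2x3 masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
score = change_f1(pred, target)
```

The expression 2·TP / (2·TP + FP + FN) is algebraically the same as the harmonic mean of precision and recall. A reported value such as 94.67 would presumably correspond to this quantity expressed as a percentage, although the abstract does not state that explicitly.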