Object Detection in Multispectral Remote Sensing Images Based on Cross-Modal Cross-Attention

Author information

Zhao Pujie, Ye Xia, Du Ziang

Affiliation

Xi'an Research Institute of High-Tech, Xi'an 710025, China.

Publication information

Sensors (Basel). 2024 Jun 24;24(13):4098. doi: 10.3390/s24134098.

DOI:10.3390/s24134098

PMID:39000877

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11244361/

Abstract

In complex environments, a single visible image is not sufficient to perceive the surroundings. This paper proposes a novel dual-stream real-time detector for target detection in extreme environments such as nighttime and fog, which efficiently utilises both visible and infrared images to achieve Fast All-Weather environment sensing (FAWDet). First, to allow the network to process information from different modalities simultaneously, the state-of-the-art end-to-end detector YOLOv8 is extended by expanding its backbone in parallel into a dual stream. Then, to avoid information loss as the network deepens, a cross-modal feature enhancement module is designed that enhances the features of each modality through cross-modal attention mechanisms, which also improves the detection of small targets. In addition, to address the significant differences between modal features, a three-stage fusion strategy is proposed that optimises feature integration by fusing along the spatial, channel, and overall dimensions. The cross-modal feature fusion module is trained end to end. Extensive experiments on two datasets validate that the proposed method achieves state-of-the-art performance in detecting small targets. The cross-modal real-time detector not only demonstrates excellent stability and robust detection performance, but also provides a new solution for target detection in extreme environments.
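The abstract does not give the internals of the cross-modal feature enhancement module, so the following is only a generic sketch of the underlying idea: features from one modality act as queries over the other modality's features, and the attended result is added back as a residual. It is not the paper's implementation. The function names `cross_attention` and `enhance`, the identity projections (no learned query, key, or value weights), and the toy token lists are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention.

    Each query token (from one modality) attends over all tokens of the
    other modality. Identity projections are an assumption for brevity;
    a real module would use learned linear layers.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the other modality's tokens.
        attended = [sum(w * v[i] for w, v in zip(weights, keys_values))
                    for i in range(d)]
        out.append(attended)
    return out

def enhance(visible, infrared):
    """Hypothetical residual enhancement of both streams."""
    vis_ctx = cross_attention(visible, infrared)
    ir_ctx = cross_attention(infrared, visible)
    vis_out = [[x + c for x, c in zip(tok, ctx)]
               for tok, ctx in zip(visible, vis_ctx)]
    ir_out = [[x + c for x, c in zip(tok, ctx)]
              for tok, ctx in zip(infrared, ir_ctx)]
    return vis_out, ir_out

# Toy inputs: three 2-dimensional feature tokens per modality.
visible = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
infrared = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
vis_out, ir_out = enhance(visible, infrared)
```

Because the attention weights form a convex combination, each attended vector stays within the range of the other modality's features, and the residual addition keeps the original features of each stream intact.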

