

Dual-YOLO Architecture from Infrared and Visible Images for Object Detection.

Affiliations

Bionic Robot Key Laboratory of Ministry of Education, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China.

Yangtze Delta Region Academy, Beijing Institute of Technology, Jiaxing 314003, China.

Publication Information

Sensors (Basel). 2023 Mar 8;23(6):2934. doi: 10.3390/s23062934.

Abstract

With the development of infrared detection technology and the growth of military remote-sensing needs, infrared object detection networks with low false-alarm rates and high detection accuracy have become a research focus. However, because infrared images lack texture information, the false detection rate of infrared object detection is high, which reduces detection accuracy. To address these problems, we propose an infrared object detection network named Dual-YOLO, which integrates visible-image features. To preserve detection speed, we adopt You Only Look Once v7 (YOLOv7) as the base framework and design dual feature-extraction channels for infrared and visible images. In addition, we develop attention fusion and fusion shuffle modules to reduce detection errors caused by redundant fused feature information, and we introduce Inception and SE modules to enhance the complementary characteristics of infrared and visible images. Furthermore, we design a fusion loss function so that the network converges quickly during training. Experimental results show that the proposed Dual-YOLO network reaches 71.8% mean Average Precision (mAP) on the DroneVehicle remote-sensing dataset and 73.2% mAP on the KAIST pedestrian dataset; detection accuracy reaches 84.5% on the FLIR dataset. The proposed architecture is expected to be applied in military reconnaissance, unmanned driving, and public safety.
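The abstract mentions SE (squeeze-and-excitation) modules and attention fusion of the two feature streams. The NumPy sketch below is not the paper's implementation; the function names, weight shapes, and the simple sum-fusion step are illustrative assumptions, shown only to clarify how SE-style channel gating could reweight an infrared and a visible feature map before fusing them.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_reweight(feat, w1, w2):
    """Squeeze-and-excitation gating: pool each channel, predict a
    per-channel weight in (0, 1), and rescale the feature map."""
    s = feat.mean(axis=(1, 2))       # squeeze: (C, H, W) -> (C,)
    z = np.maximum(w1 @ s, 0.0)      # excitation FC + ReLU: (C//r,)
    gate = sigmoid(w2 @ z)           # per-channel weights: (C,)
    return feat * gate[:, None, None]

def attention_fuse(ir_feat, vis_feat, w1, w2):
    """Gate each stream with SE weights, then sum the two maps.
    (Sum-fusion is an assumption; the paper's fusion module differs.)"""
    return se_reweight(ir_feat, w1, w2) + se_reweight(vis_feat, w1, w2)

# Toy shapes: 8 channels, 4x4 spatial map, reduction ratio r = 2.
C, H, W, r = 8, 4, 4, 2
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
ir = rng.standard_normal((C, H, W))
vis = rng.standard_normal((C, H, W))
fused = attention_fuse(ir, vis, w1, w2)
print(fused.shape)  # (8, 4, 4)
```

The gating keeps the fused map at the same resolution as either input, so it can slot between backbone stages without changing downstream layer shapes.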


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d9d/10055770/25221f461d6f/sensors-23-02934-g001.jpg
