Lightweight Vehicle Detection Based on Mamba_ViT

Authors

Song Ze, Wang Yuhai, Xu Shuobo, Wang Peng, Liu Lele

Affiliation

School of Information and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China.

Publication information

Sensors (Basel). 2024 Nov 6;24(22):7138. doi: 10.3390/s24227138.

DOI:10.3390/s24227138

PMID:39598916

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11598317/

Abstract

Vehicle detection algorithms are essential for intelligent traffic management and autonomous driving systems. Current vehicle detection algorithms rely largely on deep learning, using convolutional neural networks (CNNs) to extract vehicle image features automatically. In real traffic scenarios, however, a single feature extraction unit struggles to capture the full vehicle information in the scene, which degrades detection performance. To address this issue, we propose a lightweight vehicle detection algorithm based on Mamba_ViT. First, we introduce a new feature extraction architecture (Mamba_ViT) that separates shallow and deep features and processes them independently to obtain a more complete contextual representation. Second, we employ a multi-scale feature fusion mechanism to strengthen the integration of shallow and deep features; the resulting vehicle detection algorithm is named Mamba_ViT_YOLO. Experimental results on the UA-DETRAC dataset show that the proposed algorithm improves mAP@50 by 3.2% compared to the latest YOLOv8 algorithm, while using only 60% of the model parameters.
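The abstract reports its result as mAP@50 without defining the metric. Under the usual object detection convention, the "50" means that a predicted box counts as a correct detection only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below illustrates that matching criterion only. It is not the authors' evaluation code, and the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2
    (an assumed layout, chosen for this illustration).
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Return True when the prediction overlaps the ground truth enough
    to count as a match at the given IoU threshold (0.5 for mAP@50)."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    ground_truth = (0, 0, 10, 10)
    good_prediction = (0, 0, 10, 5)     # covers half of the ground truth
    poor_prediction = (5, 5, 15, 15)    # overlaps only one corner
    print(is_true_positive(good_prediction, ground_truth))
    print(is_true_positive(poor_prediction, ground_truth))
```

A full mAP computation additionally ranks predictions by confidence, matches each ground-truth box at most once, and averages the resulting precision over recall levels and classes; those steps are omitted here.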
