Shao Yongxin, Sun Zhetao, Tan Aihong, Yan Tianhong
College of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou, China.
Front Neurorobot. 2023 Feb 16;17:1092564. doi: 10.3389/fnbot.2023.1092564. eCollection 2023.
LiDAR-based 3D object detection and classification is a critical task for autonomous driving. However, real-time inference from extremely sparse 3D data is a formidable challenge. Complex-YOLO addresses the disorder and sparsity of point clouds by projecting them onto a bird's-eye view, and achieves real-time LiDAR-based 3D object detection. However, Complex-YOLO does not detect object height, has a shallow network, and detects small objects with poor accuracy. To address these issues, this paper makes the following improvements: (1) it adds a multi-scale feature fusion network to improve the detection of small objects; (2) it uses the more advanced RepVGG as the backbone network to increase network depth and improve overall detection performance; and (3) it adds an effective height detector to the network to improve height detection. In experiments on the KITTI dataset, our algorithm achieved good accuracy while remaining fast and memory-efficient: 48 FPS on an RTX 3070 Ti and 20 FPS on a GTX 1060, with a memory usage of 841 MiB.
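The bird's-eye-view projection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `point_cloud_to_bev`, the grid ranges, the cell size, and the log-normalized density channel are all assumptions chosen for illustration. Each LiDAR point `(x, y, z, intensity)` is binned into a 2D grid cell, and each cell keeps its maximum height, maximum intensity, and a normalized point count.

```python
import math


def point_cloud_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=1.0):
    """Project (x, y, z, intensity) points onto a bird's-eye-view grid.

    Returns three 2D lists indexed as [row along x][column along y]:
    maximum height, maximum intensity, and normalized density per cell.
    The ranges and cell size are illustrative assumptions, not values
    taken from the paper.
    """
    rows = int(round((x_range[1] - x_range[0]) / cell))
    cols = int(round((y_range[1] - y_range[0]) / cell))
    height = [[0.0] * cols for _ in range(rows)]
    intensity = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]

    for x, y, z, inten in points:
        # Discard points that fall outside the region covered by the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp the indices to guard against floating-point rounding at the edge.
        r = min(rows - 1, int((x - x_range[0]) / cell))
        c = min(cols - 1, int((y - y_range[0]) / cell))
        # The first point in a cell sets the height; later points only raise it.
        if count[r][c] == 0 or z > height[r][c]:
            height[r][c] = z
        intensity[r][c] = max(intensity[r][c], inten)
        count[r][c] += 1

    # Assumed normalization: log-scaled point count, capped at 1.0.
    density = [
        [min(1.0, math.log(n + 1) / math.log(64)) for n in row] for row in count
    ]
    return height, intensity, density
```

Stacking the three returned grids as image channels gives a dense, ordered 2D input that a convolutional detector can consume, which is how a projection of this kind sidesteps the disorder and sparsity of the raw point cloud.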