Department of Electrical and Computer Engineering, James Worth Bagley College of Engineering, Mississippi State University, Starkville, MS 39762, USA.
Sensors (Basel). 2022 Sep 16;22(18):7010. doi: 10.3390/s22187010.
Three-dimensional object detection is crucial for autonomous driving because it lets the vehicle understand its driving environment. Because the pooling operation in a standard CNN causes information loss, we designed a 3D object detection network based on wavelet multiresolution analysis that uses no pooling operation. Additionally, instead of a single filter as in standard convolution, we used the lower-frequency and higher-frequency coefficients as filters. These filters capture more relevant parts than a single filter, enlarging the receptive field. The model comprises a discrete wavelet transform (DWT) and an inverse wavelet transform (IWT), with skip connections between the contracting and expanding layers to encourage feature reuse. The IWT enriches the feature representation by fully recovering the details lost during the downsampling operation. Element-wise summation was used for the skip connections to decrease the computational burden. We trained the model with the Haar and Daubechies (Db4) wavelets. The results for two-level wavelet decomposition show that a lightweight model can be built without a significant loss in performance. The experimental results on KITTI's BEV and 3D evaluation benchmarks show that our model outperforms the PointPillars-based model by up to 14% while reducing the number of trainable parameters.
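To illustrate why a DWT/IWT pair can replace pooling without discarding information, the following is a minimal NumPy sketch of a single-level 2D Haar transform and its inverse. It is not the authors' implementation: the function names `haar_dwt2` and `haar_iwt2` are hypothetical, the sub-band labels (LL, LH, HL, HH) follow one common convention that varies between libraries, and only the Haar case is covered, not Db4.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT over the last two axes (both must be even).

    Returns four sub-bands (LL, LH, HL, HH), each at half the spatial
    resolution. Sub-band naming is an assumed convention.
    """
    a = x[..., 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_iwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape[-2:]
    x = np.empty(ll.shape[:-2] + (2 * h, 2 * w), dtype=ll.dtype)
    x[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((3, 8, 8))  # (channels, height, width)
    bands = haar_dwt2(feature_map)
    restored = haar_iwt2(*bands)
    print("sub-band shape:", bands[0].shape)
    print("perfect reconstruction:", np.allclose(restored, feature_map))
```

Unlike max or average pooling, which keeps one value per 2x2 block, the four sub-bands together retain every degree of freedom of the input, so the inverse transform can restore the original resolution exactly. In a network, the sub-bands would typically be stacked along the channel axis before the next convolution; that stacking step is an assumption here and is not shown.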