Thaker Keval, Chennupati Sumanth, Rawashdeh Nathir, Rawashdeh Samir A
Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA.
Department of Applied Computing, Michigan Technological University, Houghton, MI 49931, USA.
J Imaging. 2023 Dec 31;10(1):12. doi: 10.3390/jimaging10010012.
Despite significant strides toward vehicle autonomy, robust perception under low-light conditions remains a persistent challenge. In this study, we investigate the potential of multispectral imaging, leveraging deep learning models to enhance object detection performance in nighttime driving. Features encoded from red, green, and blue (RGB) visual-spectrum images and from thermal infrared images are combined to implement a multispectral object detection model. This approach has proven more effective than using visual channels alone, as thermal images provide complementary information for discriminating objects in low-illumination conditions. Moreover, studies on how to fuse these two modalities effectively for optimal object detection performance remain scarce. In this work, we present a framework based on the Faster R-CNN architecture with a feature pyramid network. We design various fusion approaches using concatenation and addition operators at varying stages of the network and analyze their impact on object detection performance. Our experimental results on the KAIST and FLIR datasets show that our framework outperforms both unimodal baselines and existing multispectral object detectors.
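The two fusion operators named in the abstract can be illustrated with a minimal sketch. NumPy stands in for a deep learning framework here, and the array names, channel counts, and spatial sizes are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical feature maps from the RGB and thermal backbone branches,
# laid out as (channels, height, width). Sizes are illustrative only.
rgb_feat = np.random.rand(256, 32, 32)
thermal_feat = np.random.rand(256, 32, 32)

# Additive fusion: element-wise sum keeps the channel count unchanged,
# so the fused map drops directly into the downstream network.
fused_add = rgb_feat + thermal_feat  # shape (256, 32, 32)

# Concatenation fusion: stacking along the channel axis doubles the
# channel count; in practice a 1x1 convolution usually follows to
# project back to the expected width.
fused_cat = np.concatenate([rgb_feat, thermal_feat], axis=0)  # (512, 32, 32)

print(fused_add.shape, fused_cat.shape)
```

The trade-off sketched here is the one the fusion-stage experiments probe: addition is parameter-free and shape-preserving, while concatenation retains both modalities' features separately at the cost of extra channels.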