Alldieck Thiemo, Bahnsen Chris H, Moeslund Thomas B
Visual Analysis of People Lab, Aalborg University, 9000 Aalborg, Denmark.
Sensors (Basel). 2016 Nov 18;16(11):1947. doi: 10.3390/s16111947.
In order to enable robust 24-hour monitoring of traffic under changing environmental conditions, it is beneficial to observe the traffic scene with several sensors, preferably of different modalities. To fully benefit from multi-modal sensor output, however, the data must be fused. This paper introduces a new approach to fusing color RGB and thermal video streams that uses not only the information from the videos themselves, but also the available contextual information about the scene. The contextual information is used to judge the quality of each modality and guides the fusion of two parallel segmentation pipelines operating on the RGB and thermal video streams. The potential of the proposed context-aware fusion is demonstrated by extensive quantitative and qualitative tests on existing and novel video datasets, benchmarked against competing approaches to multi-modal fusion.
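To make the fusion step concrete, the sketch below shows one plausible reading of the described idea: each modality's per-pixel foreground probability is weighted by a context-derived quality score before thresholding. The function name, the convex-combination weighting, and the quality values are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of context-weighted fusion of two segmentation pipelines.
# The quality scores, names, and weighting scheme are illustrative assumptions.
import numpy as np


def fuse_foreground(p_rgb: np.ndarray,
                    p_thermal: np.ndarray,
                    q_rgb: float,
                    q_thermal: float,
                    threshold: float = 0.5) -> np.ndarray:
    """Fuse per-pixel foreground probabilities from the RGB and thermal
    pipelines, weighted by context-derived quality scores in [0, 1]."""
    w = q_rgb / (q_rgb + q_thermal + 1e-9)      # relative trust in RGB
    fused = w * p_rgb + (1.0 - w) * p_thermal   # convex combination
    return fused > threshold                    # binary foreground mask


# Example: at night the RGB modality is judged unreliable (low quality score),
# so the fused mask leans on the thermal stream.
p_rgb = np.random.rand(480, 640)        # stand-ins for real pipeline outputs
p_thermal = np.random.rand(480, 640)
mask = fuse_foreground(p_rgb, p_thermal, q_rgb=0.2, q_thermal=0.9)
```

The convex combination keeps the fused values interpretable as probabilities; any monotone quality-to-weight mapping (e.g., driven by time of day or measured illumination) could be substituted without changing the structure of the fusion.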