Department of Mechanical Engineering, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China.
Department of Armament Science and Technology, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China.
Sensors (Basel). 2020 Mar 16;20(6):1653. doi: 10.3390/s20061653.
Multi-object tracking (MOT) plays a crucial role on various platforms. Occlusion and insertion among targets, complex backgrounds, and demanding real-time requirements increase the difficulty of MOT problems. Most state-of-the-art MOT approaches adopt the tracking-by-detection strategy, which relies on compute-intensive sliding windows or anchoring schemes to detect matching targets or candidates in each frame. In this work, we introduce a more efficient and effective spatial-temporal attention scheme to track multiple objects in various scenarios. Using a semantic-feature-based spatial attention mechanism and a novel motion model, we address the insertion and location of candidates. Online-learned, target-specific convolutional neural networks (CNNs) were used to estimate target occlusion and to classify candidates by adapting the appearance model. A temporal attention mechanism was adopted to update the online module by balancing current and history frames. Extensive experiments were performed on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) benchmarks and on an Armored Target Tracking Dataset (ATTD) built for ground-armored targets. Experimental results show that the proposed method achieved outstanding tracking performance and met practical application requirements.