Fan Jialue, Xu Wei, Wu Ying, Gong Yihong
Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208 USA.
IEEE Trans Neural Netw. 2010 Oct;21(10):1610-23. doi: 10.1109/TNN.2010.2066286. Epub 2010 Aug 30.
In this paper, we treat tracking as a learning problem: estimating the location and scale of an object given its previous location and scale, together with the current and previous image frames. Given a set of examples, we train convolutional neural networks (CNNs) to perform this estimation task. Unlike other learning methods, the CNNs learn spatial and temporal features jointly from image pairs of two adjacent frames. We introduce multiple pathways in the CNN to better fuse local and global information. A shift-variant CNN architecture is designed to alleviate the drift problem that arises when distracting objects in a cluttered environment are similar to the target. Furthermore, we employ CNNs to estimate the scale through the accurate localization of some key points. These techniques are object-independent, so the proposed method can be applied to track other types of objects. The tracker's ability to handle complex situations is demonstrated on many test sequences.
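To make the idea of learning from image pairs more concrete, the following is a minimal sketch of how two adjacent frames could be stacked as a two-channel input and scanned by one convolutional filter to yield a location estimate. It is not the architecture described in the paper: the helper names (`make_pair_input`, `conv2d_valid`, `argmax_2d`, `blob_frame`), the synthetic 8x8 frames, and the hand-set kernel weights are all assumptions made for illustration, whereas a real tracker would learn its weights from training examples.

```python
from typing import List, Tuple

Image = List[List[float]]


def crop(frame: Image, top: int, left: int, size: int) -> Image:
    """Cut a size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def make_pair_input(prev_frame: Image, curr_frame: Image,
                    top: int, left: int, size: int) -> List[Image]:
    """Stack co-located patches from two adjacent frames as two channels."""
    return [crop(prev_frame, top, left, size),
            crop(curr_frame, top, left, size)]


def conv2d_valid(channels: List[Image], kernels: List[Image]) -> Image:
    """Multi-channel 'valid' cross-correlation producing one response map."""
    k = len(kernels[0])
    h = len(channels[0]) - k + 1
    w = len(channels[0][0]) - k + 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                channels[c][y + i][x + j] * kernels[c][i][j]
                for c in range(len(channels))
                for i in range(k)
                for j in range(k)
            )
    return out


def argmax_2d(response: Image) -> Tuple[int, int]:
    """Return (row, col) of the largest response value."""
    best = max((v, y, x)
               for y, row in enumerate(response)
               for x, v in enumerate(row))
    return best[1], best[2]


def blob_frame(size: int, top: int, left: int) -> Image:
    """Synthetic frame: a 2x2 bright blob on a dark background."""
    frame = [[0.0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 1.0
    return frame


# Hypothetical scene: the blob moves from (2, 2) to (3, 4) between frames.
prev_frame = blob_frame(8, 2, 2)
curr_frame = blob_frame(8, 3, 4)
pair = make_pair_input(prev_frame, curr_frame, top=0, left=0, size=8)

# Hand-set weights (an assumption, not learned): reward brightness in the
# current frame and penalize brightness at the same place in the previous one.
kernels = [
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 0: previous frame
    [[1.0, 1.0], [1.0, 1.0]],      # channel 1: current frame
]

response = conv2d_valid(pair, kernels)
estimate = argmax_2d(response)
print(estimate)
```

Because the filter spans both channels, its response depends on appearance and on change between frames at the same time, which is the sense in which spatial and temporal cues are combined here. The multiple pathways, shift-variant design, and key-point scale estimation mentioned in the abstract are not modeled in this sketch.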