IEEE Trans Neural Netw Learn Syst. 2015 Dec;26(12):3045-59. doi: 10.1109/TNNLS.2015.2401834. Epub 2015 Mar 18.
Object tracking is an important step in many artificial vision tasks. Current state-of-the-art implementations remain too computationally demanding to track highly dynamic scenes in real time. This paper presents a novel real-time method for part-based visual tracking of complex objects from the output of an asynchronous event-based camera. It extends the pictorial structures model introduced by Fischler and Elschlager 40 years ago with a new formulation of the problem that allows visual input to be processed dynamically, in real time and at high temporal resolution, on a conventional PC. The method represents an object as a set of basic elements linked by springs. Each basic element is a simple tracker capable of following an ellipse-like target at several kilohertz on a conventional computer. For each incoming event, the method updates the elastic connections between the trackers so that they maintain the desired geometric structure of the tracked object. Updating at this temporal resolution lets the structure adapt to projective deformations of the tracked object in the focal plane. The elastic energy of this virtual mechanical system provides a quality criterion for tracking and can be used to determine whether the measured deformations are caused by the perspective projection of the perceived object or by occlusions. Experiments on real-world data show the robustness of the method in the context of dynamic face tracking.
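The spring-linked tracker idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the names `Tracker`, `Spring`, `process_event`, and `elastic_energy`, the nearest-tracker assignment, the linear (Hooke) spring law, and the `gain` and `step` values are all assumptions chosen for illustration. Trackers are reduced to points here, whereas the paper tracks ellipse-like shapes.

```python
import math
from dataclasses import dataclass


@dataclass
class Tracker:
    """Hypothetical point tracker; the paper's trackers follow ellipse-like shapes."""
    x: float
    y: float


@dataclass
class Spring:
    """Hypothetical elastic link between trackers i and j."""
    i: int
    j: int
    rest_length: float
    stiffness: float = 1.0


def elastic_energy(trackers, springs):
    """Sum of 0.5 * k * (d - d0)^2 over all springs (assumed Hooke's law)."""
    total = 0.0
    for s in springs:
        a, b = trackers[s.i], trackers[s.j]
        d = math.hypot(b.x - a.x, b.y - a.y)
        total += 0.5 * s.stiffness * (d - s.rest_length) ** 2
    return total


def process_event(trackers, springs, ex, ey, gain=0.1, step=0.05):
    """Update the structure for one event at pixel (ex, ey); return the energy.

    Assumed scheme: the nearest tracker moves a fraction `gain` toward the
    event, then every spring applies one explicit relaxation step.
    """
    # 1. Assign the event to the nearest tracker and pull it toward the event.
    k = min(range(len(trackers)),
            key=lambda n: math.hypot(trackers[n].x - ex, trackers[n].y - ey))
    t = trackers[k]
    t.x += gain * (ex - t.x)
    t.y += gain * (ey - t.y)

    # 2. Relax each spring: a stretched spring pulls its endpoints together,
    #    a compressed one pushes them apart.
    for s in springs:
        a, b = trackers[s.i], trackers[s.j]
        dx, dy = b.x - a.x, b.y - a.y
        d = math.hypot(dx, dy)
        if d == 0.0:
            continue  # direction undefined for coincident trackers
        f = step * s.stiffness * (d - s.rest_length)
        ux, uy = dx / d, dy / d
        a.x += f * ux
        a.y += f * uy
        b.x -= f * ux
        b.y -= f * uy

    return elastic_energy(trackers, springs)


if __name__ == "__main__":
    parts = [Tracker(0.0, 0.0), Tracker(10.0, 0.0)]
    links = [Spring(0, 1, rest_length=10.0)]
    energy = process_event(parts, links, ex=-5.0, ey=0.0)
    print(f"elastic energy after one event: {energy:.5f}")
```

In a sketch like this, the returned energy could be compared against a tuned threshold to flag deformations that are too large to be explained by perspective alone, in the spirit of the quality criterion the abstract mentions. The threshold and how it would be chosen are not given in the source.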