Li Chenglong, Lin Liang, Zuo Wangmeng, Tang Jin, Yang Ming-Hsuan
IEEE Trans Pattern Anal Mach Intell. 2019 Nov;41(11):2770-2782. doi: 10.1109/TPAMI.2018.2864965. Epub 2018 Aug 13.
Existing visual tracking methods usually localize a target object with a bounding box, so the performance of foreground object trackers or detectors is often degraded by the background clutter included in the box. To handle this problem, we learn a patch-based graph representation for visual tracking. The tracked object is modeled with a graph whose nodes are a set of non-overlapping image patches: the weight of each node indicates how likely the patch is to belong to the foreground, and the weight of each edge indicates the appearance compatibility of two neighboring nodes. This graph is dynamically learned and applied in object tracking and model updating. During the tracking process, the proposed algorithm performs three main steps in each frame. First, the graph is initialized by assigning binary weights to some image patches to mark them as object or background patches according to the predicted bounding box. Second, the graph is optimized to refine the patch weights by using a novel alternating direction method of multipliers. Third, the object feature representation is updated by imposing the patch weights on the extracted image features. The object location is predicted by maximizing the classification score in the structured support vector machine. Extensive experiments show that the proposed tracking algorithm performs favorably against state-of-the-art methods on large-scale benchmark datasets.
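The three per-frame steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the patches form a regular grid over a region slightly larger than the predicted box, the outer ring of patches is seeded as background and a central core as object, edge weights are a Gaussian affinity between 4-connected neighbors, and the weight refinement is a plain graph-regularized least-squares solve standing in for the paper's ADMM-based optimization. The linear scorer at the end is likewise a simple stand-in for the structured SVM, and all function and parameter names are hypothetical.

```python
import numpy as np


def init_seeds(rows, cols, border=1, core=2):
    """Step 1 (assumed scheme): binary seeds on a patch grid.

    Returns q (seed value: 1 = object, 0 = background) and m (1 where a
    seed is set, 0 where the patch is left unlabeled).
    """
    q = np.zeros((rows, cols))
    m = np.zeros((rows, cols))
    # Outer ring of patches: background seeds (q stays 0).
    m[:border, :] = 1.0
    m[-border:, :] = 1.0
    m[:, :border] = 1.0
    m[:, -border:] = 1.0
    # Central core of patches: object seeds.
    q[core:rows - core, core:cols - core] = 1.0
    m[core:rows - core, core:cols - core] = 1.0
    return q.ravel(), m.ravel()


def grid_affinity(feats, rows, cols, sigma=1.0):
    """Edge weights: Gaussian appearance affinity between 4-connected patches."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    d2 = np.sum((feats[i] - feats[j]) ** 2)
                    A[i, j] = A[j, i] = np.exp(-d2 / (2.0 * sigma ** 2))
    return A


def refine_weights(A, q, m, lam=1.0):
    """Step 2 (stand-in for ADMM): graph-regularized least squares.

    Minimizes sum_i m_i (w_i - q_i)^2 + lam * w^T L w, where L is the graph
    Laplacian of A, by solving the linear system (M + lam L) w = M q.
    """
    L = np.diag(A.sum(axis=1)) - A
    w = np.linalg.solve(np.diag(m) + lam * L, m * q)
    return np.clip(w, 0.0, 1.0)


def weighted_descriptor(feats, w):
    """Step 3: scale each patch feature by its weight and concatenate."""
    return (feats * w[:, None]).ravel()


def best_candidate(candidate_descs, svm_w):
    """Pick the candidate box with the highest linear score (SVM stand-in)."""
    return int(np.argmax(candidate_descs @ svm_w))
```

In this sketch, unlabeled patches between the background ring and the object core receive weights pulled toward whichever seeds they resemble most, so clutter patches inside the box contribute less to the final descriptor than patches that look like the object.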