Utah State University, Logan, UT, United States.
Neural Netw. 2020 Sep;129:334-343. doi: 10.1016/j.neunet.2020.06.011. Epub 2020 Jun 17.
Visual trackers using deep neural networks have demonstrated favorable performance in object tracking. However, training a deep classification network on overlapping initial target regions may lead to an overfitted model. To improve model generalization, we propose an appearance variation adaptation (AVA) tracker that aligns the feature distributions of target regions over time by learning an adaptation mask in an adversarial network. The proposed adversarial network consists of a generator network and a discriminator network that compete over a discriminator loss in a mini-max optimization problem. Specifically, the discriminator network aims to distinguish recent target regions from earlier ones by minimizing the discriminator loss, while the generator network aims to produce an adaptation mask that maximizes the discriminator loss. We incorporate a gradient reversal layer in the adversarial network to solve this mini-max optimization in an end-to-end manner. We compare the proposed AVA tracker with recent state-of-the-art trackers through extensive experiments on the OTB50, OTB100, and VOT2016 tracking benchmarks. Among the compared methods, AVA yields the highest area under curve (AUC) score of 0.712 and the highest average precision score of 0.951 on the OTB50 tracking benchmark. It achieves the second best AUC score of 0.688 and the best precision score of 0.924 on the OTB100 tracking benchmark. AVA also achieves the second best expected average overlap (EAO) score of 0.366, the best failure rate of 0.68, and the second best accuracy of 0.53 on the VOT2016 tracking benchmark.
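The role of the gradient reversal step in the mini-max problem can be illustrated with a minimal plain-Python sketch. It is a toy under stated assumptions, not the AVA architecture: the "generator" is reduced to a single hypothetical scalar mask `g`, the "discriminator" to a single hypothetical weight `w`, and the label `y` marks a region as recent (1) or earlier (0). The names `GradientReversal`, `step`, `lam`, and `lr` are illustrative and do not come from the paper. The point is only that, because the layer flips the sign of the gradient on the way back, one ordinary descent update lowers the discriminator loss with respect to `w` while raising it with respect to `g`.

```python
import math


class GradientReversal:
    """Identity in the forward pass; scales the gradient by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical reversal strength

    def forward(self, h):
        return h

    def backward(self, grad_out):
        return -self.lam * grad_out


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def bce(p, y):
    """Binary cross-entropy for one prediction p and one label y."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps))


def step(g, w, x, y, grl, lr=0.1):
    """One descent update of both parameters on a single sample (x, y)."""
    # Forward pass: mask the feature, pass through the reversal layer, classify.
    h = g * x
    z = grl.forward(h)
    p = sigmoid(w * z)
    loss = bce(p, y)

    # Backward pass, written out by hand.
    d_logit = p - y              # derivative of BCE with respect to the logit
    d_w = d_logit * z            # discriminator gradient (unchanged)
    d_z = d_logit * w            # gradient arriving at the reversal layer
    d_h = grl.backward(d_z)      # sign flipped here
    d_g = d_h * x                # generator gradient (reversed)

    # Both parameters take a plain descent step; the flipped sign turns the
    # generator's step into ascent on the discriminator loss.
    return g - lr * d_g, w - lr * d_w, loss


if __name__ == "__main__":
    grl = GradientReversal(lam=1.0)
    g, w = 1.0, 0.5
    g_new, w_new, loss = step(g, w, x=2.0, y=1.0, grl=grl)
    print("loss before update:", loss)
    print("loss with updated w only:", bce(sigmoid(w_new * g * 2.0), 1.0))
    print("loss with updated g only:", bce(sigmoid(w * g_new * 2.0), 1.0))
```

In a framework with automatic differentiation, the same effect is usually obtained by defining an operation whose forward pass is the identity and whose custom backward pass negates the incoming gradient, so that a single optimizer can train both networks end to end.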