Dong Xingping, Shen Jianbing, Wu Dongming, Guo Kan, Jin Xiaogang, Porikli Fatih
IEEE Trans Image Process. 2019 Jul;28(7):3516-3527. doi: 10.1109/TIP.2019.2898567. Epub 2019 Feb 11.
In the same vein as discriminative one-shot learning, Siamese networks can recognize an object from a single exemplar with the same class label. However, because they are trained only on pairs of instances, they do not exploit the underlying structure of the data or the relationships among multiple samples. In this paper, we propose a new quadruplet deep network that examines the potential connections among the training instances, aiming to achieve a more powerful representation. We design a shared network with four branches that receives a multi-tuple of instances as input; the branches are connected by a novel loss function consisting of a pair loss and a triplet loss. From each multi-tuple, we select the most similar and the most dissimilar instances, according to the similarity metric, as the positive and negative inputs of the triplet loss. We show that this scheme improves training performance. Furthermore, we introduce a new weight layer that automatically selects suitable combination weights, avoiding conflicts between the triplet and pair losses that would otherwise degrade performance. We evaluate our quadruplet framework on model-free tracking-by-detection of objects from a single initial exemplar on several visual object tracking benchmarks. Our extensive experimental analysis demonstrates that our tracker achieves superior performance at a real-time processing speed of 78 frames/s. Our source code is available.
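The combination of pair loss, triplet loss, and learned combination weights described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity metric, the exact loss forms, or the weight layer, so cosine similarity, a logistic pair loss, a hinge-style triplet loss with margin 0.5, and a softmax over two weight logits are all assumptions made here for illustration, as are all function and parameter names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pair_loss(sim, label):
    """Logistic loss on one (exemplar, instance) pair; label is +1 or -1."""
    return math.log1p(math.exp(-label * sim))


def triplet_loss(sim_pos, sim_neg, margin=0.5):
    """Hinge loss: the positive should beat the negative by at least `margin`."""
    return max(0.0, margin - sim_pos + sim_neg)


def softmax(logits):
    """Numerically stable softmax, used here as a stand-in weight layer."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def quadruplet_loss(exemplar, instances, labels, weight_logits=(0.0, 0.0), margin=0.5):
    """Weighted sum of pair loss and triplet loss over one multi-tuple.

    `exemplar` plus three `instances` correspond to the four branches.
    `weight_logits` are hypothetical trainable parameters of the weight layer.
    Returns (total, pair, triplet).
    """
    sims = [cosine(exemplar, x) for x in instances]

    # Pair loss: average over every (exemplar, instance) pair in the tuple.
    pair = sum(pair_loss(s, y) for s, y in zip(sims, labels)) / len(sims)

    # Triplet loss: most similar instance as positive, most dissimilar as negative.
    triplet = triplet_loss(max(sims), min(sims), margin)

    # Combination weights are positive and sum to one.
    w_pair, w_triplet = softmax(list(weight_logits))
    return w_pair * pair + w_triplet * triplet, pair, triplet


if __name__ == "__main__":
    exemplar = [1.0, 0.0]
    instances = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    labels = [1, -1, -1]
    total, pair, triplet = quadruplet_loss(exemplar, instances, labels)
    print(total, pair, triplet)
```

In a trainable model the embeddings would come from the shared network and the weight logits would be updated by backpropagation together with the network parameters; the softmax is only one plausible way to keep the two weights positive and normalized.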