IEEE Trans Pattern Anal Mach Intell. 2020 Feb;42(2):276-290. doi: 10.1109/TPAMI.2018.2848925. Epub 2018 Jun 25.
Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate the task of training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions that increase the diversity in our ensemble. These loss functions can be applied either for weight initialization or during training. Together, our contributions leverage large embedding sizes more effectively by significantly reducing the correlation within the embedding and consequently increasing its retrieval accuracy. Our method works with any differentiable loss function and introduces no additional parameters at test time. We evaluate our metric learning method on image retrieval tasks and show that it improves over state-of-the-art methods on the CUB-200-2011, Cars-196, Stanford Online Products, In-Shop Clothes Retrieval, and VehicleID datasets. Our findings therefore suggest that dividing deep networks at the end into several smaller and diverse networks can significantly reduce overfitting.
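The following is a minimal PyTorch sketch of the idea described in the abstract, not the authors' released code: it splits the last embedding layer into K sub-embeddings ("learners"), trains them with a boosting-style reweighting in which each learner sees pairs reweighted by the previous learners' errors, and adds a generic decorrelation term standing in for the paper's two diversity losses. All names (EmbeddingEnsemble, ensemble_boosting_loss, the exponential weight update with rate eta, the decorrelation penalty) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingEnsemble(nn.Module):
    """Backbone features -> one linear embedding layer, split into K learners."""
    def __init__(self, backbone_dim=512, embed_dim=512, n_learners=4):
        super().__init__()
        assert embed_dim % n_learners == 0
        self.fc = nn.Linear(backbone_dim, embed_dim)
        self.chunk = embed_dim // n_learners

    def forward(self, features):
        emb = self.fc(features)
        # Divide the last embedding layer into K sub-embeddings (the ensemble).
        return torch.split(emb, self.chunk, dim=1)

def pair_loss(a, b, labels, margin=0.5):
    """Per-pair contrastive loss; any differentiable pair loss could be used.
    labels: float tensor, 1.0 for similar pairs, 0.0 for dissimilar pairs."""
    d = F.pairwise_distance(a, b)
    return labels * d.pow(2) + (1 - labels) * F.relu(margin - d).pow(2)

def ensemble_boosting_loss(sub_a, sub_b, labels, eta=0.1):
    """Online-gradient-boosting-style training: learner k receives the batch
    reweighted by how poorly learners 1..k-1 handled each pair (the exact
    reweighting rule here is an assumed exponential update, for illustration)."""
    weights = torch.ones_like(labels)  # learner 1 sees uniform weights
    total = 0.0
    for ea, eb in zip(sub_a, sub_b):
        per_pair = pair_loss(F.normalize(ea), F.normalize(eb), labels)
        total = total + (weights.detach() * per_pair).mean()
        # Pairs the current learner handles badly gain weight for the next one.
        weights = weights * torch.exp(eta * per_pair.detach())
        weights = weights / weights.sum() * len(labels)
    return total

def diversity_penalty(sub_embeddings):
    """Illustrative stand-in for the paper's diversity losses: penalize
    cosine correlation between the embeddings of different learners."""
    loss = 0.0
    subs = [F.normalize(e) for e in sub_embeddings]
    for i in range(len(subs)):
        for j in range(i + 1, len(subs)):
            loss = loss + (subs[i] * subs[j]).sum(dim=1).abs().mean()
    return loss

# Usage sketch: sub_a/sub_b are the learners' embeddings of the two images
# in each pair; at test time the sub-embeddings are simply concatenated,
# so no extra parameters are introduced.
model = EmbeddingEnsemble()
feats_a, feats_b = torch.randn(8, 512), torch.randn(8, 512)
labels = torch.randint(0, 2, (8,)).float()
sub_a, sub_b = model(feats_a), model(feats_b)
loss = ensemble_boosting_loss(sub_a, sub_b, labels) + 0.1 * diversity_penalty(sub_a)
loss.backward()
```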