Faculty of Science, Braamfontein Campus, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg 2000, South Africa.
Auckland Park Campus, Institute for Intelligent Systems, University of Johannesburg, Johannesburg 2006, South Africa.
Sensors (Basel). 2021 Sep 12;21(18):6109. doi: 10.3390/s21186109.
Similarity learning using deep convolutional neural networks has been applied extensively to computer vision problems. This appeal is driven by its success in one-shot and zero-shot classification. Advances in similarity learning are essential for smaller datasets, or datasets in which only a few labelled examples exist per class, such as wildlife re-identification. Improving the performance of similarity learning models involves developing new sampling techniques and designing loss functions better suited to training similarity in neural networks. However, the impact of these advances is typically tested on larger datasets, with limited attention given to smaller, imbalanced datasets such as those found in wildlife re-identification. To this end, we test recent advances in loss functions for similarity learning on several animal re-identification tasks. We contribute two new public datasets, Nyala and Lions, to the animal re-identification challenge. Our results are state of the art on all public datasets tested except Pandas. The achieved Top-1 Recall is 94.8% on the Zebra dataset, 72.3% on the Nyala dataset, 79.7% on the Chimps dataset, and 88.9% on the Tiger dataset. For the Lion dataset, we set a new benchmark at 94.8%. We find that the best-performing loss function across all datasets is generally the triplet loss; however, it offers only a marginal improvement over Proxy-NCA models. We demonstrate that no single combination of neural network architecture and loss function is best suited to all datasets, although VGG-11 may be the most robust first choice. Our results highlight the need for broader experimentation with loss functions and neural network architectures on the more challenging task of wildlife re-identification, beyond classical benchmarks.
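The triplet loss referred to above penalises an embedding when an anchor sits closer to a negative (different individual) than to a positive (same individual) by less than a margin. A minimal illustrative sketch in pure Python follows; the `margin` value of 0.2 and the use of Euclidean distance are assumptions for illustration, not the paper's reported settings:

```python
from math import sqrt


def euclidean(u, v):
    # Euclidean distance between two embedding vectors of equal length.
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    # Hinge on the distance gap: the loss is zero once the positive is
    # closer to the anchor than the negative by at least `margin`.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

For example, a well-separated triplet such as `triplet_loss([0, 0], [0, 1], [3, 0])` yields zero loss, while a violating triplet such as `triplet_loss([0, 0], [2, 0], [1, 0])` yields a positive loss of 1.2. In practice this is computed over mini-batches of learned embeddings, where the choice of sampling strategy for triplets strongly affects training.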