Zhang Jun, Jiao Licheng, Ma Wenping, Liu Fang, Liu Xu, Li Lingling, Zhu Hao
IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5669-5681. doi: 10.1109/TNNLS.2021.3130655. Epub 2023 Sep 1.
Local image descriptor learning has been instrumental in various computer vision tasks. Recent innovations lie in measuring the similarity of descriptor vectors via metric learning over randomly selected Siamese or triplet patches. Local image descriptor learning focuses more on hard samples, since easy samples contribute little to optimization. However, few studies address hard samples of image patches from the perspective of loss functions and design appropriate learning algorithms to obtain a more compact descriptor representation. This article proposes a regularized descriptor learning network (RDLNet) that uses triplet networks to make the network focus on learning hard samples and compact descriptors. A novel hard sample mining strategy is designed to select the hardest negative samples within each mini-batch. Then, a batch margin loss focused on hard samples is adopted to optimize the distances of extreme cases. Finally, to stabilize training and prevent network collapse, orthogonal regularization is designed to constrain convolutional kernels and obtain rich deep features. RDLNet provides a compact, discriminative, low-dimensional representation and can be embedded easily into other pipelines. This article gives extensive experimental results on large benchmarks in multiple scenarios, along with generalization in matching applications, showing significant improvements.
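The abstract outlines three ingredients: mining the hardest negative in each mini-batch, a margin loss over anchor-positive and hardest-negative distances, and an orthogonality penalty on convolutional kernels. The following is a minimal NumPy sketch of these ideas under common conventions (batches of matching descriptor pairs, triplet-style margin, a Frobenius-norm penalty on `W Wᵀ − I`); the function names and details are illustrative and do not reproduce the paper's exact formulation.

```python
import numpy as np

def hardest_negative_triplet_loss(anchors, positives, margin=1.0):
    """Margin loss with in-batch hardest-negative mining (illustrative sketch).

    anchors, positives: (N, D) arrays; row i of each is a matching pair.
    """
    # Pairwise Euclidean distances between all anchors and all positives: (N, N).
    d = np.linalg.norm(anchors[:, None, :] - positives[None, :, :], axis=2)
    # Distances of the matching pairs sit on the diagonal.
    pos = np.diag(d)
    # Exclude matching pairs, then pick the closest (hardest) negative per anchor.
    masked = d + np.eye(len(d)) * 1e9
    hardest_neg = masked.min(axis=1)
    # Hinge: push the hardest negative at least `margin` farther than the positive.
    return np.maximum(0.0, margin + pos - hardest_neg).mean()

def orthogonal_penalty(W):
    """Penalize deviation of kernel rows from orthonormality: ||W W^T - I||_F^2.

    W: (out_channels, fan_in) matrix of flattened convolutional kernels.
    """
    G = W @ W.T
    return np.sum((G - np.eye(G.shape[0])) ** 2)
```

For orthonormal kernel rows the penalty is exactly zero, so adding it (scaled by a small weight) to the mining loss discourages redundant filters without dominating the metric-learning objective.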