Song Kun, Han Junwei, Cheng Gong, Lu Jiwen, Nie Feiping
IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):4591-4604. doi: 10.1109/TPAMI.2021.3073587. Epub 2022 Aug 4.
In this paper, we reveal that metric learning suffers from a serious inseparability problem when no informative sample mining is performed. Since inseparable samples are often mixed with hard samples, current informative sample mining strategies used to deal with the inseparability problem may introduce side effects, such as instability of the objective function. To alleviate this problem, we propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML). In ANML, we design two thresholds that adaptively identify the inseparable similar and dissimilar samples during training, so that inseparable-sample removal and metric-parameter learning are carried out within the same procedure. Because the proposed ANML objective is non-continuous, we develop a function, named the log-exp mean function, to construct a continuous surrogate formulation, which can be efficiently solved by gradient descent. Like the Triplet loss, ANML can be used to learn both linear and deep embeddings. By analyzing the proposed method, we find that it has some interesting properties. For example, when ANML is used to learn a linear embedding, well-known metric learning algorithms such as large margin nearest neighbor (LMNN) and neighbourhood components analysis (NCA) are special cases of ANML under particular parameter settings. When it is used to learn deep features, state-of-the-art deep metric learning algorithms such as the Triplet loss, Lifted structure loss, and Multi-similarity loss become special cases of ANML. Furthermore, the log-exp mean function proposed in our method gives a new perspective from which to review deep metric learning methods such as Proxy-NCA and the N-pairs loss. Finally, promising experimental results demonstrate the effectiveness of the proposed method.
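The abstract does not give the exact definition of the log-exp mean function, but a common way to build such a continuous surrogate is a log-sum-exp (generalized power-mean-like) construction that smoothly interpolates between the minimum, the arithmetic mean, and the maximum of a set of distances as a sharpness parameter varies. The sketch below illustrates that behavior under this assumption; the function name and parameter `p` are illustrative, not taken from the paper.

```python
import numpy as np

def log_exp_mean(x, p):
    """Illustrative log-exp mean: (1/p) * log(mean(exp(p * x))).

    As p -> 0 it recovers the arithmetic mean of x; for large
    positive p it approaches max(x); for large negative p it
    approaches min(x). This smooth interpolation is one plausible
    reading of how a single differentiable formula can subsume
    mean-, max-, and min-based losses as special cases.
    """
    x = np.asarray(x, dtype=float)
    # Shift by the max of p*x for numerical stability of log-sum-exp.
    m = (p * x).max()
    return (m + np.log(np.mean(np.exp(p * x - m)))) / p

distances = [0.5, 1.0, 2.0]
print(log_exp_mean(distances, 1e-8))   # close to the arithmetic mean
print(log_exp_mean(distances, 200.0))  # close to max(distances)
print(log_exp_mean(distances, -200.0)) # close to min(distances)
```

Because the surrogate is smooth in both the inputs and `p`, a gradient-descent solver can optimize it directly, which matches the abstract's claim that the continuous formulation is efficiently solvable by gradient methods.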