Dipartimento di Ingegneria dell'Informazione, University of Siena, Via Roma 56, 53100, Siena, Italy.
Neural Netw. 2012 Feb;26:141-58. doi: 10.1016/j.neunet.2011.10.009. Epub 2011 Oct 25.
In this paper we present Similarity Neural Networks (SNNs), a neural network model able to learn a similarity measure for pairs of patterns, exploiting binary supervision on their similarity/dissimilarity relationships. Pairwise relationships, also referred to as pairwise constraints, generally contain less information than class labels, but, in some contexts, are easier to obtain from human supervisors. The SNN architecture guarantees the basic properties of a similarity measure (symmetry and non-negativity) and it can deal with non-transitivity of the similarity criterion. Unlike the majority of the metric learning algorithms proposed so far, it can model non-linear relationships among data while still providing a natural out-of-sample extension to novel pairs of patterns. The theoretical properties of SNNs and their application to Semi-Supervised Clustering are investigated. In particular, we introduce a novel technique that allows the clustering algorithm to compute the optimal representatives of a data partition by means of backpropagation on the input layer, biased by an L2-norm regularizer. An extensive set of experimental results is provided to compare SNNs with the most popular similarity learning algorithms. Both on benchmarks and real-world data, SNNs and SNN-based clustering show improved performance, demonstrating the advantage of the proposed neural network approach to similarity measure learning.
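To make the two ideas summarized above more concrete — a network trained from binary pairwise supervision that outputs a symmetric, non-negative similarity, and cluster representatives obtained by gradient descent on the input biased by an L2 penalty — the following Python (PyTorch) sketch illustrates the general approach. It is not the authors' implementation: the class and function names (PairwiseSimilarityNet, cluster_representative) are hypothetical, and the specific devices used here to enforce symmetry (averaging the outputs on (x, y) and (y, x)) and non-negativity (a sigmoid output) are illustrative assumptions; the actual SNN architecture in the paper may enforce these properties differently.

```python
# Minimal sketch, NOT the SNN architecture from the paper: a pairwise similarity
# network trained from binary similar/dissimilar labels, plus a representative
# computed by gradient descent on the input with an L2 regularizer.
import torch
import torch.nn as nn

class PairwiseSimilarityNet(nn.Module):
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Sigmoid()   # output in [0, 1] -> non-negative
        )

    def forward(self, x, y):
        # Symmetry enforced by averaging: s(x, y) = (f([x, y]) + f([y, x])) / 2
        s_xy = self.net(torch.cat([x, y], dim=-1))
        s_yx = self.net(torch.cat([y, x], dim=-1))
        return 0.5 * (s_xy + s_yx).squeeze(-1)

def train(model, pairs_x, pairs_y, labels, epochs=200, lr=1e-2):
    # Binary supervision: label 1 = similar pair, 0 = dissimilar pair.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(pairs_x, pairs_y), labels)
        loss.backward()
        opt.step()
    return model

def cluster_representative(model, members, lam=0.1, steps=100, lr=0.1):
    # In the spirit of the technique described above: optimize the *input* so that
    # its average similarity to the cluster members is maximal, with an L2 penalty.
    r = members.mean(dim=0, keepdim=True).clone().requires_grad_(True)
    opt = torch.optim.SGD([r], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        obj = -model(r.expand_as(members), members).mean() + lam * (r ** 2).sum()
        obj.backward()
        opt.step()
    return r.detach()
```

In this sketch the similarity stays in [0, 1] by construction and is invariant to swapping the two inputs, which are the two properties the abstract attributes to the SNN architecture; the representative computation mirrors the described "backpropagation on the input layer" only at the conceptual level.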