Richard Connor, Alan Dearle, Ben Claydon, Lucia Vadicamo
School of Computer Science, University of St Andrews, St Andrews KY16 9SS, UK.
Institute of Information Science and Technologies, Italian National Research Council (CNR), 56124 Pisa, Italy.
Entropy (Basel). 2024 Jun 3;26(6):491. doi: 10.3390/e26060491.
Cross-entropy loss is crucial in training many deep neural networks. In this context, we show a number of novel and strong correlations among related divergence functions. In particular, we demonstrate that, in some circumstances, (a) cross-entropy is almost perfectly correlated with the little-known triangular divergence, and (b) cross-entropy is strongly correlated with the Euclidean distance over the logits from which the softmax is derived. The consequences of these observations are as follows. First, triangular divergence may be used as a cheaper alternative to cross-entropy. Second, logits can be used as features in a Euclidean space that is strongly synergistic with the classification process; this justifies the use of Euclidean distance over logits as a similarity measure in cases where the network is trained using softmax and cross-entropy. We establish these correlations through empirical observation, supported by a mathematical explanation encompassing a number of strongly related divergence functions.
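The two divergences compared in the abstract have standard definitions: cross-entropy H(p, q) = -Σᵢ pᵢ log qᵢ, and triangular divergence (also known as triangular discrimination) TD(p, q) = Σᵢ (pᵢ - qᵢ)² / (pᵢ + qᵢ). The following is a minimal, hypothetical sketch of how one might measure the correlation the authors describe; the synthetic logits, function names, and sample sizes are illustrative assumptions, not code or data from the paper, where the correlation is reported for trained networks rather than random outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_i p_i log q_i
    return -(p * np.log(q + eps)).sum(axis=-1)

def triangular_divergence(p, q, eps=1e-12):
    # TD(p, q) = sum_i (p_i - q_i)^2 / (p_i + q_i)
    return ((p - q) ** 2 / (p + q + eps)).sum(axis=-1)

# Synthetic logits standing in for network outputs (assumption:
# 10,000 samples, 100 classes), with one-hot ground-truth targets.
logits = rng.normal(size=(10_000, 100))
targets = np.eye(100)[rng.integers(0, 100, size=10_000)]
probs = softmax(logits)

ce = cross_entropy(targets, probs)
td = triangular_divergence(targets, probs)
print("Pearson r(CE, TD):", np.corrcoef(ce, td)[0, 1])
```

Note that when p is one-hot, both quantities reduce to monotone functions of the probability assigned to the true class (CE = -log q_t and TD = (1 - q_t)²/(1 + q_t) + (1 - q_t)), which is one way to see why a strong correlation between them is plausible.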