Frank Nielsen
Sony Computer Science Laboratories, Tokyo 141-0022, Japan.
Entropy (Basel). 2024 Feb 23;26(3):193. doi: 10.3390/e26030193.
Exponential families are statistical models that serve as workhorses in statistics, information theory, and machine learning, among other fields. An exponential family can be normalized either subtractively by its cumulant or free energy function, or equivalently divisively by its partition function. Both the cumulant and partition functions are strictly convex and smooth, and each induces a corresponding pair of Bregman and Jensen divergences. It is well known that the skewed Bhattacharyya distances between the probability densities of an exponential family amount to skewed Jensen divergences, induced by the cumulant function, between their corresponding natural parameters, and that in the limit cases the sided Kullback-Leibler divergences amount to reverse-sided Bregman divergences. In this work, we first show that the α-divergences between the non-normalized densities of an exponential family amount to scaled α-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetical means allows both convex functions and their arguments to be deformed, thereby defining dually flat spaces with corresponding divergences when ordinary convexity is preserved.
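As a quick aid to the reader, the following is a minimal LaTeX sketch of the identities summarized in the abstract, written in standard exponential-family notation; the symbols t(x), F, Z, J, B, the 1/(α(1-α)) normalization of the α-divergence between unnormalized densities, and the comparative-convexity characterization in the final comment are conventions and assumptions supplied here, not taken verbatim from the abstract.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Natural parameter \theta, sufficient statistic t(x), base measure \mu:
% unnormalized density \tilde p_\theta, partition function Z, cumulant F = \log Z.
\begin{align*}
\tilde p_\theta(x) &= \exp\bigl(\theta^\top t(x)\bigr), &
Z(\theta) &= \int \tilde p_\theta(x)\,\mathrm{d}\mu(x), \\
F(\theta) &= \log Z(\theta), &
p_\theta(x) &= \tilde p_\theta(x)/Z(\theta).
\end{align*}
% Known identity: the \alpha-skewed Bhattacharyya distance between normalized
% densities equals the \alpha-skewed Jensen divergence induced by the cumulant F.
\begin{align*}
B_\alpha(p_{\theta_1}:p_{\theta_2})
 &= -\log \int p_{\theta_1}^{\alpha}(x)\, p_{\theta_2}^{1-\alpha}(x)\,\mathrm{d}\mu(x) \\
 &= \alpha F(\theta_1) + (1-\alpha) F(\theta_2)
    - F\bigl(\alpha\theta_1 + (1-\alpha)\theta_2\bigr)
  =: J_{F,\alpha}(\theta_1:\theta_2).
\end{align*}
% Limit cases: the sided Kullback--Leibler divergences equal reverse-sided
% Bregman divergences induced by F.
\begin{align*}
\mathrm{KL}(p_{\theta_1}:p_{\theta_2}) &= B_F(\theta_2:\theta_1), &
B_F(\theta:\theta') &= F(\theta) - F(\theta')
   - (\theta-\theta')^\top \nabla F(\theta').
\end{align*}
% Sketch of the first result stated in the abstract: with the 1/(\alpha(1-\alpha))
% normalization of the \alpha-divergence between positive (unnormalized) densities
% and the identity
% \int \tilde p_{\theta_1}^{\alpha} \tilde p_{\theta_2}^{1-\alpha}\,\mathrm{d}\mu
%  = Z(\alpha\theta_1 + (1-\alpha)\theta_2),
% one obtains a scaled \alpha-skewed Jensen divergence induced by Z.
\begin{align*}
D_\alpha(\tilde p_{\theta_1}:\tilde p_{\theta_2})
 &= \frac{1}{\alpha(1-\alpha)} \int \Bigl[\alpha\,\tilde p_{\theta_1}(x)
    + (1-\alpha)\,\tilde p_{\theta_2}(x)
    - \tilde p_{\theta_1}^{\alpha}(x)\,\tilde p_{\theta_2}^{1-\alpha}(x)\Bigr]
    \mathrm{d}\mu(x) \\
 &= \frac{1}{\alpha(1-\alpha)}\, J_{Z,\alpha}(\theta_1:\theta_2).
\end{align*}
% Comparative convexity (second result, only sketched here as an assumption):
% for quasi-arithmetic means M_\rho, M_\tau generated by monotone functions
% \rho and \tau, a function F is (M_\rho, M_\tau)-convex exactly when
% \tau \circ F \circ \rho^{-1} is convex in the ordinary sense; the latter then
% induces Bregman/Jensen divergences and a dually flat space on the deformed
% parameters \rho(\theta).
\end{document}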