Wong Ting-Kam Leonard, Yang Jiaowen
Department of Statistical Sciences, University of Toronto, Toronto, Canada.
Facebook, Menlo Park, USA.
Inf Geom. 2022;5(1):131-159. doi: 10.1007/s41884-021-00053-7. Epub 2021 Jul 30.
Optimal transport and information geometry both study geometric structures on spaces of probability distributions. Optimal transport characterizes the cost-minimizing movement from one distribution to another, while information geometry originates from coordinate-invariant properties of statistical inference. Their relations and applications in statistics and machine learning have started to gain more attention. In this paper we give a new differential-geometric relation between the two fields. Namely, the pseudo-Riemannian framework of Kim and McCann, which provides a geometric perspective on the fundamental Ma-Trudinger-Wang (MTW) condition in the regularity theory of optimal transport maps, encodes the dualistic structure of a statistical manifold. This general relation is described using the framework of c-divergence, under which divergences are defined by optimal transport maps. As a by-product, we obtain a new information-geometric interpretation of the MTW tensor on the graph of the transport map. This relation sheds light on old and new aspects of information geometry. The dually flat geometry of Bregman divergence corresponds to the quadratic cost and the pseudo-Euclidean space, and the logarithmic L^(α)-divergence introduced by Pal and the first author has constant sectional curvature in a sense to be made precise. In these cases we give a geometric interpretation of the information-geometric curvature in terms of the divergence between a primal-dual pair of geodesics.
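The correspondence between Bregman divergence and the quadratic cost stated in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the divergence induced by a cost c takes the form c(x, y') - f(x) - g(y'), where y' is the image of x' under the transport map and (f, g) is a pair of dual potentials, and it assumes the quadratic cost c(x, y) = |x - y|^2 / 2 with Brenier map equal to the gradient of a convex potential. The potential `phi(x) = sum(exp(x_i))` and all function names are hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical convex potential chosen because its conjugate is explicit.
def phi(x):
    return np.sum(np.exp(x))

def grad_phi(x):
    # Under the quadratic cost, the transport map is assumed to be grad phi.
    return np.exp(x)

def phi_conj(y):
    # Legendre conjugate of phi, valid for y > 0 componentwise.
    return np.sum(y * np.log(y) - y)

def bregman(x, x_prime):
    # Bregman divergence of phi: phi(x) - phi(x') - <grad phi(x'), x - x'>.
    return phi(x) - phi(x_prime) - grad_phi(x_prime) @ (x - x_prime)

def quadratic_cost(x, y):
    return 0.5 * np.sum((x - y) ** 2)

def c_divergence(x, x_prime):
    # Assumed form: c(x, y') - f(x) - g(y'), with y' = grad phi(x') and
    # dual potentials f = |x|^2/2 - phi, g = |y|^2/2 - phi_conj.
    y_prime = grad_phi(x_prime)
    f = 0.5 * x @ x - phi(x)
    g = 0.5 * y_prime @ y_prime - phi_conj(y_prime)
    return quadratic_cost(x, y_prime) - f - g

x = np.array([0.3, -1.2, 0.7])
x_prime = np.array([-0.5, 0.4, 1.1])

print(bregman(x, x_prime), c_divergence(x, x_prime))
```

Under these assumptions the two printed values agree up to floating-point error, and the divergence vanishes when both arguments coincide, which follows from the Fenchel equality phi(x) + phi_conj(grad phi(x)) = <x, grad phi(x)>.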