Li Tao, Meng Cheng, Xu Hongteng, Zhu Jun
IEEE Trans Neural Netw Learn Syst. 2025 Sep;36(9):16814-16824. doi: 10.1109/TNNLS.2025.3551275.
Hyperbolic spaces have been widely used for embedding hierarchically structured data over the past decade. However, few studies have focused on efficient distance metrics for comparing probability distributions in hyperbolic spaces. To bridge this gap, we propose a novel metric called the hyperbolic space-filling curve projection Wasserstein (SFW) distance. The idea is to first project two probability distributions onto a space-filling curve, which yields a closed-form coupling between them, and then to calculate the transport distance between the two distributions in the hyperbolic space under that coupling. Theoretically, we show that the SFW distance is a proper metric and is well-defined for probability measures with bounded supports. We also provide statistical convergence rates for the proposed estimator. Moreover, we propose two variants of the SFW distance, based on geodesic and horospherical projections, respectively, to combat the curse of dimensionality. Empirical results on synthetic and real-world data indicate that the SFW distance can effectively serve as a low-complexity surrogate for the popular Wasserstein distance.
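The two-step idea described above (order the samples along a space-filling curve, couple them by rank, then measure the transport cost with the hyperbolic distance) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' construction: it assumes the 2-D Poincaré disk model, two empirical measures with the same number of equally weighted samples, and an ordinary Euclidean Hilbert curve laid over the square [-1, 1]^2 that contains the disk. The names `hilbert_index`, `curve_rank`, `poincare_dist`, and `sfw_sketch` are hypothetical helpers introduced here.

```python
import numpy as np


def hilbert_index(x, y, order=10):
    """Hilbert-curve index of integer grid cell (x, y) on a 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve keeps a consistent orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def curve_rank(points, order=10):
    """Indices that sort 2-D points in [-1, 1]^2 along the Hilbert curve."""
    n = 1 << order
    # Assumption: discretize the bounding square of the unit disk into n x n cells.
    cells = np.clip(((points + 1.0) / 2.0 * n).astype(int), 0, n - 1)
    keys = [hilbert_index(int(cx), int(cy), order) for cx, cy in cells]
    return np.argsort(keys, kind="stable")


def poincare_dist(u, v):
    """Geodesic distance between paired points in the Poincare disk model."""
    sq = np.sum((u - v) ** 2, axis=-1)
    denom = (1.0 - np.sum(u ** 2, axis=-1)) * (1.0 - np.sum(v ** 2, axis=-1))
    return np.arccosh(1.0 + 2.0 * sq / denom)


def sfw_sketch(X, Y, p=2, order=10):
    """Rank-matched transport cost between two equal-size point clouds."""
    assert X.shape == Y.shape, "sketch assumes equal sample sizes"
    Xs = X[curve_rank(X, order)]
    Ys = Y[curve_rank(Y, order)]
    # Coupling: the i-th point along the curve in X is matched to the i-th in Y.
    return float(np.mean(poincare_dist(Xs, Ys) ** p) ** (1.0 / p))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-0.5, 0.5, size=(200, 2))  # all points lie inside the unit disk
    Y = rng.uniform(-0.5, 0.5, size=(200, 2))
    print(sfw_sketch(X, Y))
```

The cost of this sketch is dominated by the two sorts, so it scales as O(n log n) in the number of samples. Because rank matching is one admissible coupling, the value it returns is never smaller than the exact Wasserstein distance between the same two empirical measures under the same ground metric.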