Lin Chi-Heng, Azabou Mehdi, Dyer Eva L
Department of Electrical and Computer Engineering, Georgia Tech, Atlanta, Georgia, USA.
Machine Learning Program, Georgia Tech, Atlanta, Georgia, USA.
Proc Mach Learn Res. 2021 Jul;139:6631-6641.
Optimal transport (OT) is a widely used technique for distribution alignment, with applications throughout the machine learning, graphics, and vision communities. Without additional structural assumptions on the transport plan, however, OT can be fragile to outliers and noise, especially in high dimensions. Here, we introduce Latent Optimal Transport (LOT), a new approach to OT that learns low-dimensional structure in the data and simultaneously uses that structure to solve the alignment task. The idea behind our approach is to learn two sets of "anchors" that constrain the flow of transport between a source and a target distribution. In both theoretical and empirical studies, we show that LOT regularizes the rank of the transport plan and makes it more robust to outliers and to variations in sampling density. We also show that by allowing the source and target to have different anchors, and by using LOT to align the latent spaces between anchors, the resulting transport plan has better structural interpretability and highlights connections between both the individual data points and the local geometry of the datasets.
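To make the anchor idea concrete, the following is a minimal NumPy sketch of a transport plan that is forced to flow through two sets of anchors. It is not the authors' algorithm: LOT learns the anchors jointly with the transport, whereas this sketch assumes the anchors are fixed k-means centroids, assigns every point to its nearest anchor, and solves only the small anchor-to-anchor problem with entropic (Sinkhorn) OT. The function names `kmeans`, `sinkhorn`, and `anchored_plan`, and the parameters `eps`, `kx`, and `ky`, are hypothetical and chosen for illustration.

```python
import numpy as np

def sq_dists(A, B):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def kmeans(X, k, n_iter=50, seed=0):
    # Plain Lloyd iterations; stands in for learned anchors (an assumption).
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers, sq_dists(X, centers).argmin(axis=1)

def sinkhorn(a, b, C, eps=0.2, n_iter=1000):
    # Entropic OT between histograms a and b; the cost is rescaled by its maximum.
    K = np.exp(-C / (eps * C.max()))
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def safe_inverse(w):
    # Elementwise 1/w, with 0 wherever w is 0 (empty anchors carry no mass).
    return np.divide(1.0, w, out=np.zeros_like(w), where=w > 0)

def anchored_plan(X, Y, kx, ky):
    n, m = len(X), len(Y)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform point masses
    Zx, lx = kmeans(X, kx, seed=0)  # source anchors
    Zy, ly = kmeans(Y, ky, seed=1)  # target anchors
    # Source -> source anchors and target anchors -> target (hard assignments).
    Px = np.zeros((n, kx)); Px[np.arange(n), lx] = a
    Py = np.zeros((ky, m)); Py[ly, np.arange(m)] = b
    u, v = Px.sum(axis=0), Py.sum(axis=1)  # mass held by each anchor
    # Anchor -> anchor transport in the latent space.
    Pz = sinkhorn(u, v, sq_dists(Zx, Zy))
    # Compose the three legs: Px diag(1/u) Pz diag(1/v) Py.
    return (Px * safe_inverse(u)) @ Pz @ (safe_inverse(v)[:, None] * Py)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
Y = rng.normal(size=(80, 5)) + 1.0
P = anchored_plan(X, Y, kx=3, ky=4)
print(P.shape, np.linalg.matrix_rank(P))
```

Because every unit of mass must pass through the anchors, the composed plan is a product of matrices whose inner dimensions are `kx` and `ky`, so its rank cannot exceed `min(kx, ky)`. Using a different number of anchors on each side mirrors the abstract's point that the source and target need not share anchors.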