Diffusion Earth Mover's Distance and Distribution Embeddings.

Authors

Tong Alexander, Huguet Guillaume, Natik Amine, MacDonald Kincaid, Kuchroo Manik, Coifman Ronald, Wolf Guy, Krishnaswamy Smita

Publication information

ArXiv. 2021 Feb 25:arXiv:2102.12833v2.

Abstract

We propose a new, fast method for measuring distances between large numbers of related high-dimensional datasets, called the Diffusion Earth Mover's Distance (EMD). We model the datasets as distributions supported on a common data graph that is derived from the affinity matrix computed on the combined data. In cases where the graph is a discretization of an underlying closed Riemannian manifold, we prove that Diffusion EMD is topologically equivalent to the standard EMD with a geodesic ground distance. Diffusion EMD can be computed in $\tilde{O}(n)$ time and is more accurate than similarly fast algorithms such as tree-based EMDs. We also show that Diffusion EMD is fully differentiable, making it amenable to future uses in gradient-descent frameworks such as deep neural networks. Finally, we demonstrate an application of Diffusion EMD to single-cell data collected from 210 COVID-19 patient samples at Yale New Haven Hospital. Here, Diffusion EMD can derive distances between patients on the manifold of cells at least two orders of magnitude faster than equally accurate methods. This distance matrix between patients can be embedded into a higher-level patient manifold, which uncovers structure and heterogeneity in patients. More generally, Diffusion EMD is applicable to all datasets that are massively collected in parallel in many medical and biological systems.
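The abstract describes datasets as distributions on a shared graph built from an affinity matrix on the combined data. The following is a minimal sketch of that idea, not the authors' implementation: it builds a dense Gaussian affinity matrix, row-normalizes it into a diffusion operator, diffuses each dataset's distribution at dyadic time scales, and sums weighted L1 differences between scales. The function name `diffusion_emd_sketch`, the Gaussian kernel, the bandwidth `sigma`, the number of scales, and the `2 ** (-(n_scales - k - 1) * alpha)` weighting are all assumptions made for illustration. Because the matrices here are dense, this version is quadratic in the number of points and does not reproduce the $\tilde{O}(n)$ running time stated above.

```python
import numpy as np

def diffusion_emd_sketch(points, labels, n_scales=4, alpha=0.25, sigma=1.0):
    """Illustrative multiscale diffusion distance between labeled datasets.

    Hypothetical helper: kernel, scales, and weights are assumptions,
    not details taken from the abstract.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    # Gaussian affinity matrix on the combined data (dense, O(n^2) memory).
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # Row-normalize to obtain a Markov diffusion operator on the graph.
    P = affinity / affinity.sum(axis=1, keepdims=True)

    # One column per dataset: uniform mass on that dataset's points.
    groups = np.unique(labels)
    mu = np.stack([(labels == g) / np.sum(labels == g) for g in groups], axis=1)

    dist = np.zeros((len(groups), len(groups)))
    P_pow = P.T            # transpose so that diffusion preserves total mass
    diffused = P_pow @ mu  # distributions after 2^0 = 1 diffusion step
    for k in range(n_scales):
        P_pow = P_pow @ P_pow  # now (P^T)^(2^(k+1))
        coarser = P_pow @ mu
        # Weighted difference between consecutive dyadic scales.
        band = 2.0 ** (-(n_scales - k - 1) * alpha) * (coarser - diffused)
        # Pairwise L1 distance between the datasets' band signals.
        dist += np.abs(band[:, :, None] - band[:, None, :]).sum(axis=0)
        diffused = coarser
    # Final term: L1 distance between the coarsest diffused distributions.
    dist += np.abs(diffused[:, :, None] - diffused[:, None, :]).sum(axis=0)
    return groups, dist

# Toy usage: three small 1-D datasets; dataset 2 lies far from datasets 0 and 1.
toy_points = np.array([[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [5.0], [5.1], [5.2]])
toy_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
toy_groups, toy_dist = diffusion_emd_sketch(toy_points, toy_labels)
print(toy_dist)
```

The returned matrix has one row and column per dataset, so it could in principle be passed to any embedding method that accepts precomputed distances, in the spirit of the patient-level embedding mentioned in the abstract.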

Similar articles

Embedding Signals on Graphs with Unbalanced Diffusion Earth Mover's Distance.
Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:5647-5651. doi: 10.1109/icassp43922.2022.9746556. Epub 2022 Apr 27.
Kernel earth mover's distance for EEG classification.
Clin EEG Neurosci. 2013 Jul;44(3):182-7. doi: 10.1177/1550059412471521. Epub 2013 May 10.
On Markov Earth Mover's Distance.
Int J Image Graph. 2014 Oct;14(4):1450016. doi: 10.1142/S0219467814500168.
H-EMD: A Hierarchical Earth Mover's Distance Method for Instance Segmentation.
IEEE Trans Med Imaging. 2022 Oct;41(10):2582-2597. doi: 10.1109/TMI.2022.3169449. Epub 2022 Sep 30.
Linearized multidimensional earth-mover's-distance gradient flows.
IEEE Trans Image Process. 2013 Dec;22(12):5322-35. doi: 10.1109/TIP.2013.2279952.
