Pont Mathieu, Tierny Julien
IEEE Trans Vis Comput Graph. 2024 Sep;30(9):6390-6406. doi: 10.1109/TVCG.2023.3334755. Epub 2024 Jul 31.
This article presents a computational framework for the Wasserstein auto-encoding of merge trees (MT-WAE), a novel extension of the classical auto-encoder neural network architecture to the Wasserstein metric space of merge trees. In contrast to traditional auto-encoders, which operate on vectorized data, our formulation explicitly manipulates merge trees on their associated metric space at each layer of the network, resulting in superior accuracy and interpretability. Our neural network approach can be interpreted as a non-linear generalization of previous linear attempts (Pont et al. 2023) at merge tree encoding. It also trivially extends to persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our algorithms, with MT-WAE computations on the order of minutes on average. We show the utility of our contributions in two applications adapted from previous work on merge tree encoding (Pont et al. 2023). First, we apply MT-WAE to merge tree compression, concisely representing the trees by their coordinates in the final layer of our auto-encoder. Second, we document an application to dimensionality reduction, exploiting the latent space of our auto-encoder for the visual analysis of ensemble data. We illustrate the versatility of our framework by introducing two penalty terms, which help preserve in the latent space both the Wasserstein distances between merge trees and their clusters. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation to support reproducibility.
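The abstract does not give formulas for the two penalty terms. Purely as an illustration of the distance-preservation idea, the Python sketch below scores how well a set of latent coordinates reproduces a given matrix of pairwise distances. Everything in it is an assumption made for the example: the names `metric_penalty`, `input_dists` and `latent_coords` are hypothetical, the latent space is taken to be Euclidean, and the mean squared gap is one plausible choice rather than the article's definition. The input distances stand in for Wasserstein distances between merge trees, which would have to be computed elsewhere.

```python
import math
from itertools import combinations


def metric_penalty(input_dists, latent_coords):
    """Mean squared gap between input distances and latent Euclidean distances.

    input_dists   -- square matrix (list of lists); entry [i][j] is the
                     distance between items i and j in the input space
                     (assumed here to be precomputed)
    latent_coords -- list of coordinate tuples, one per item
    """
    pairs = list(combinations(range(len(latent_coords)), 2))
    if not pairs:
        # Fewer than two items: there is no pairwise distance to preserve.
        return 0.0
    total = 0.0
    for i, j in pairs:
        latent_d = math.dist(latent_coords[i], latent_coords[j])
        total += (input_dists[i][j] - latent_d) ** 2
    return total / len(pairs)


# Toy data (hypothetical): three items with made-up input distances.
toy_dists = [
    [0.0, 5.0, 2.0],
    [5.0, 0.0, 4.0],
    [2.0, 4.0, 0.0],
]
toy_latent = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
penalty = metric_penalty(toy_dists, toy_latent)
```

A value of zero means the latent layout reproduces every input distance exactly. In a training setting, a term of this kind would typically be added to the reconstruction loss with a weight, which is again an assumption about usage and not a description of the published method.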