Matuk James, Kurtek Sebastian, Bharath Karthik
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11035-11046. doi: 10.1109/TPAMI.2024.3451328. Epub 2024 Nov 6.
Topological data analysis provides a set of tools to uncover low-dimensional structure in noisy point clouds. Prominent among these tools is persistent homology, which summarizes birth-death times of homological features using data objects known as persistence diagrams. To better aid statistical analysis, a functional representation of the diagrams, known as persistence landscapes, enables the use of functional data analysis and machine learning tools. Topological and geometric variabilities inherent in point clouds are confounded in both persistence diagrams and landscapes, and it is important to distinguish topological signal from noise in order to draw reliable conclusions about the structure of the point clouds when using persistent homology. We develop a framework for decomposing variability in persistence diagrams into topological signal and topological noise through alignment of persistence landscapes using an elastic Riemannian metric. Aligned landscapes (amplitude) isolate the topological signal. Reparameterizations used for landscape alignment (phase) are linked to a resolution parameter used to generate persistence diagrams, and capture topological noise in the form of geometric, global scaling and sampling variabilities. We illustrate the importance of decoupling topological signal and topological noise in persistence diagrams (landscapes) using several simulated examples. We also demonstrate that our approach provides novel insights in two real data studies.
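As a rough illustration of the functional representation mentioned above, the following sketch evaluates a persistence landscape on a grid using the standard definition: each birth-death pair (b, d) contributes a tent function max(0, min(t - b, d - t)), and the k-th landscape at t is the k-th largest tent value. The function name `persistence_landscape`, the `grid` argument, and the `k_max` parameter are assumptions made for this example and do not come from the paper. The elastic alignment step is not shown.

```python
import numpy as np

def persistence_landscape(diagram, grid, k_max=1):
    """Evaluate the first k_max persistence landscape functions on a grid.

    diagram: iterable of (birth, death) pairs (hypothetical input format).
    grid:    1-D array of evaluation points t.
    Returns an array of shape (k_max, len(grid)).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]

    # Tent function for every (point, grid value) combination.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births,
                                       deaths - grid[None, :]))

    # Sort each grid column in descending order, so row k holds the
    # (k+1)-th largest tent value at every t.
    tents = -np.sort(-tents, axis=0)

    # Landscapes beyond the number of diagram points are identically zero.
    landscapes = np.zeros((k_max, grid.size))
    n = min(k_max, tents.shape[0])
    landscapes[:n] = tents[:n]
    return landscapes

# Toy diagram with two overlapping features.
toy_diagram = [(0.0, 2.0), (1.0, 3.0)]
toy_grid = np.array([0.0, 1.0, 1.5, 2.0, 3.0])
toy_landscapes = persistence_landscape(toy_diagram, toy_grid, k_max=2)
```

Each row of `toy_landscapes` is a function sampled on a common grid, which is the form that functional data analysis tools typically operate on.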