Wang Rui, Zhao Rundong, Ribando-Gros Emily, Chen Jiahui, Tong Yiying, Wei Guo-Wei
Department of Mathematics, Michigan State University, MI 48824, USA.
Department of Computer Science and Engineering, Michigan State University, MI 48824, USA.
Found Data Sci. 2021 Mar;3(1):67-97. doi: 10.3934/fods.2021006.
Persistent homology (PH) is one of the most popular tools in topological data analysis (TDA), while graph theory has had a significant impact on data science. Our earlier work introduced the persistent spectral graph (PSG) theory as a unified multiscale paradigm to encompass TDA and geometric analysis. In PSG theory, families of persistent Laplacian matrices (PLMs) corresponding to various topological dimensions are constructed via a filtration to sample a given dataset at multiple scales. The harmonic spectra from the null spaces of PLMs offer the same topological invariants, namely persistent Betti numbers, at various dimensions as those provided by PH, while the non-harmonic spectra of PLMs give rise to additional geometric analysis of the shape of the data. In this work, we develop an open-source software package, called highly efficient robust multidimensional evolutionary spectra (HERMES), to enable broad applications of PSGs in science, engineering, and technology. To ensure the reliability and robustness of HERMES, we have validated the software with simple geometric shapes and complex datasets from three-dimensional (3D) protein structures. We found that the smallest non-zero eigenvalues are very sensitive to data abnormality.
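The link between harmonic spectra and Betti numbers described in the abstract can be illustrated in the simplest setting, dimension 0. The sketch below does not use HERMES or its API. It assumes a distance-threshold graph on a point cloud and the ordinary (non-persistent) graph Laplacian at each scale, and the names `graph_laplacian` and `spectrum_summary` are hypothetical helpers introduced only for illustration. The number of zero eigenvalues counts connected components (Betti-0), while the smallest non-zero eigenvalue carries additional geometric information.

```python
import numpy as np


def graph_laplacian(points, radius):
    """Graph Laplacian L = D - A of the graph joining points within `radius`.

    Assumption: a plain distance-threshold graph stands in for the
    filtration; this is an illustration, not the HERMES construction.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adjacency = (dist <= radius).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def spectrum_summary(points, radius, tol=1e-8):
    """Return (Betti-0, smallest non-zero eigenvalue or None) at one scale."""
    eigvals = np.linalg.eigvalsh(graph_laplacian(points, radius))
    betti0 = int(np.sum(eigvals < tol))  # dimension of the null space
    nonzero = eigvals[eigvals >= tol]
    lambda_min = float(nonzero.min()) if nonzero.size else None
    return betti0, lambda_min


# Four corners of a unit square, sampled at three scales.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
for radius in (0.5, 1.1, 1.5):
    print(radius, spectrum_summary(square, radius))
```

At radius 0.5 no edges exist, so the null space has dimension 4 (four components). At radius 1.1 the sides of the square connect into a 4-cycle, and at radius 1.5 the diagonals are added as well. Betti-0 equals 1 in both of the latter cases, yet the smallest non-zero eigenvalue rises from 2 to 4, which shows how the non-harmonic spectrum can distinguish shapes that the harmonic spectrum cannot.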