Zhang Wei, Shih Yi-Hsuan, Li Jr-Shin
Department of Electrical & Systems Engineering, Washington University in St. Louis, One Brookings Drive, St. Louis, MO 63130, USA.
Division of Computational & Data Sciences, Washington University in St. Louis, One Brookings Drive, St. Louis, MO 63130, USA.
PNAS Nexus. 2024 Nov 28;3(12):pgae530. doi: 10.1093/pnasnexus/pgae530. eCollection 2024 Dec.
Learning the global structure, i.e., the topological properties, inherent in complex data is an essential yet challenging task across scientific and engineering disciplines. A fundamental approach is to extract local data representations and use them to assemble the global structure. This combination of local and global operations motivates the integration of tools from algebraic and computational topology with machine learning. In this article, we propose a hierarchical simplicial manifold learning algorithm, consisting of nested clustering and topological reduction, for constructing simplicial complexes and decoding their topological properties. We show that the learned complex has the same topology as the original embedding manifold from which the data were sampled. We demonstrate the applicability, convergence, and computational efficiency of the algorithm on both synthetic and real-world data.
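The local-to-global idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: greedy farthest-point sampling stands in for nested clustering, a Vietoris-Rips complex stands in for their construction, and there is no topological reduction step. The function names, the 240-point circle, and the radius of 1.0 are all assumptions chosen for illustration. The sketch picks a few local representatives from points sampled on a circle, assembles a simplicial complex on them, and reads off the Betti numbers from the ranks of the boundary matrices.

```python
from itertools import combinations

import numpy as np


def farthest_point_landmarks(points, k):
    """Greedy max-min choice of k landmark indices (a stand-in for clustering)."""
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))  # point farthest from all chosen landmarks
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return chosen


def rips_complex(vertices, radius):
    """Edges and triangles whose vertices are pairwise within `radius`."""
    n = len(vertices)
    d = np.linalg.norm(vertices[:, None] - vertices[None, :], axis=2)
    edges = [(i, j) for i, j in combinations(range(n), 2) if d[i, j] <= radius]
    edge_set = set(edges)
    triangles = [t for t in combinations(range(n), 3)
                 if all(e in edge_set for e in combinations(t, 2))]
    return edges, triangles


def _rank(matrix):
    # matrix_rank is not reliable on empty matrices, so handle that case here.
    return 0 if matrix.size == 0 else int(np.linalg.matrix_rank(matrix))


def betti_numbers(n_vertices, edges, triangles):
    """Betti numbers b0 and b1 (real coefficients) from boundary-matrix ranks."""
    edge_index = {e: c for c, e in enumerate(edges)}
    d1 = np.zeros((n_vertices, len(edges)))
    for c, (i, j) in enumerate(edges):
        d1[i, c], d1[j, c] = -1.0, 1.0  # boundary of [i, j] is j - i
    d2 = np.zeros((len(edges), len(triangles)))
    for c, (i, j, k) in enumerate(triangles):
        # boundary of [i, j, k] is [j, k] - [i, k] + [i, j]
        d2[edge_index[(j, k)], c] = 1.0
        d2[edge_index[(i, k)], c] = -1.0
        d2[edge_index[(i, j)], c] = 1.0
    r1, r2 = _rank(d1), _rank(d2)
    return n_vertices - r1, len(edges) - r1 - r2


# Synthetic data: 240 evenly spaced points on the unit circle.
angles = 2 * np.pi * np.arange(240) / 240
points = np.column_stack([np.cos(angles), np.sin(angles)])

landmarks = points[farthest_point_landmarks(points, 8)]
edges, triangles = rips_complex(landmarks, radius=1.0)
b0, b1 = betti_numbers(len(landmarks), edges, triangles)
print(b0, b1)  # 1 1: one connected component and one loop, as for a circle
```

With eight landmarks, neighbouring representatives are about 0.77 apart and the next-nearest are about 1.41 apart, so a radius of 1.0 connects only neighbours and the complex is a single cycle. In practice the radius would have to be chosen from the data, and this choice is exactly the kind of sensitivity a hierarchical construction is meant to address.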