Ye Xulun, Zhao Jieyu, Zhang Long, Guo Lijun
IEEE Trans Cybern. 2019 Jul;49(7):2664-2677. doi: 10.1109/TCYB.2018.2832171. Epub 2018 May 16.
Multimanifold clustering separates data points that lie approximately on a union of submanifolds into several clusters. In this paper, we propose a new nonparametric Bayesian model to handle manifold-structured data. In our framework, we first model the manifold mapping function between Euclidean space and topological space with a deep neural network, and then construct the corresponding generative process for multiple-manifold data. To approximate the posterior, we apply a variational auto-encoder-based optimization algorithm. In particular, because the manifold algorithm performs poorly on real datasets in which nonmanifold and manifold clusters appear together, we extend it by integrating it with the original Dirichlet process mixture model. Experimental results demonstrate state-of-the-art clustering performance.
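The abstract names the Dirichlet process mixture model but gives no details of its generative process. As a rough illustration only, the sketch below samples data from a truncated stick-breaking Dirichlet process mixture. It is not the authors' model: the Gaussian components, the 2-D base measure, the truncation level, and the function names `stick_breaking_weights` and `sample_dp_mixture` are all assumptions made for this example, and the neural-network manifold mapping and the variational auto-encoder-based inference are omitted.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process with concentration alpha."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any component
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last component takes the remainder, so weights sum to 1
    return weights


def sample_dp_mixture(n_points, alpha=1.0, truncation=20, seed=0):
    """Draw 2-D points from a truncated DP mixture of isotropic Gaussians (assumed form)."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, truncation, rng)
    # Assumed base measure: each component mean is drawn from N(0, 5^2) per coordinate.
    means = [(rng.gauss(0.0, 5.0), rng.gauss(0.0, 5.0)) for _ in range(truncation)]
    points, labels = [], []
    for _ in range(n_points):
        # Pick a component according to the stick-breaking weights.
        k = rng.choices(range(truncation), weights=weights)[0]
        mx, my = means[k]
        # Assumed likelihood: isotropic Gaussian noise with standard deviation 0.5.
        points.append((rng.gauss(mx, 0.5), rng.gauss(my, 0.5)))
        labels.append(k)
    return points, labels, weights


points, labels, weights = sample_dp_mixture(200)
print(len(set(labels)), "components used out of", len(weights))
```

The number of components actually used is not fixed in advance. It depends on `alpha` and on the data size, which is the property that makes the model nonparametric. The truncation level only caps it for practical sampling.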