Fujita Kazuhisa
Komatsu University, Komatsu, Ishikawa, Japan.
University of Electro-Communications, Chofu, Tokyo, Japan.
PeerJ Comput Sci. 2021 Aug 20;7:e679. doi: 10.7717/peerj-cs.679. eCollection 2021.
Spectral clustering (SC) is one of the most popular clustering methods and often outperforms traditional clustering methods. SC uses the eigenvectors of a Laplacian matrix calculated from a similarity matrix of a dataset. SC has serious drawbacks: the significant increases in the time complexity derived from the computation of eigenvectors and the memory space complexity to store the similarity matrix. To address the issues, I develop a new approximate spectral clustering using the network generated by growing neural gas (GNG), called ASC with GNG in this study. ASC with GNG uses not only reference vectors for vector quantization but also the topology of the network for extraction of the topological relationship between data points in a dataset. ASC with GNG calculates the similarity matrix from both the reference vectors and the topology of the network generated by GNG. Using the network generated from a dataset by GNG, ASC with GNG achieves to reduce the computational and space complexities and improve clustering quality. In this study, I demonstrate that ASC with GNG effectively reduces the computational time. Moreover, this study shows that ASC with GNG provides equal to or better clustering performance than SC.
谱聚类(SC)是最流行的聚类方法之一,并且常常优于传统聚类方法。SC使用从数据集的相似性矩阵计算得出的拉普拉斯矩阵的特征向量。SC存在严重缺点:特征向量计算导致时间复杂度显著增加,以及存储相似性矩阵的内存空间复杂度。为了解决这些问题,我开发了一种新的近似谱聚类方法,它使用通过生长神经气体(GNG)生成的网络,在本研究中称为带GNG的ASC。带GNG的ASC不仅使用参考向量进行矢量量化,还使用网络拓扑来提取数据集中数据点之间的拓扑关系。带GNG的ASC根据参考向量和GNG生成的网络拓扑来计算相似性矩阵。通过使用GNG从数据集生成的网络,带GNG的ASC实现了降低计算和空间复杂度,并提高了聚类质量。在本研究中,我证明了带GNG的ASC有效减少了计算时间。此外,本研究表明,带GNG的ASC提供了与SC相当或更好的聚类性能。