Achieving deep clustering through the use of variational autoencoders and similarity-based loss.

Author information

Ma He

Affiliation

College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150000, China.

Publication information

Math Biosci Eng. 2022 Jul 22;19(10):10344-10360. doi: 10.3934/mbe.2022484.

Abstract

Clustering is an important and challenging research topic in many fields. Although many clustering algorithms have been developed, traditional shallow clustering algorithms cannot mine the underlying structural information of the data. Recent advances have shown that deep clustering can achieve excellent performance on clustering tasks. In this work, a novel variational autoencoder-based deep clustering algorithm is proposed. It uses a Gaussian mixture model as the prior over the latent space and an additional classifier to accurately distinguish different clusters in the latent space. A similarity-based loss function is proposed that consists of the cross-entropy of the predicted transition probabilities of clusters and the Wasserstein distance between the predicted posterior distributions. The new loss encourages the model to learn meaningful cluster-oriented representations that facilitate clustering. The experimental results show that our method consistently achieves competitive results on various data sets.
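The abstract does not give the exact form of the Wasserstein term. As a minimal sketch, assuming the predicted posteriors are Gaussians with diagonal covariance (a common choice for variational autoencoders, but an assumption here), the squared 2-Wasserstein distance has a closed form: the squared distance between the means plus the squared distance between the standard deviations. The function name and the example values below are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def w2_squared_diag_gaussians(
    mu_a: Sequence[float],
    sigma_a: Sequence[float],
    mu_b: Sequence[float],
    sigma_b: Sequence[float],
) -> float:
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    Assumes N(mu_a, diag(sigma_a^2)) and N(mu_b, diag(sigma_b^2)), where
    sigma_* are standard deviations (not variances). For this case:
        W2^2 = ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2
    """
    if not (len(mu_a) == len(sigma_a) == len(mu_b) == len(sigma_b)):
        raise ValueError("all inputs must have the same dimensionality")
    # Contribution from the difference of the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    # Contribution from the difference of the standard deviations.
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return mean_term + std_term


# Hypothetical posteriors for two samples in a 2-D latent space.
distance_sq = w2_squared_diag_gaussians(
    mu_a=[0.0, 0.0], sigma_a=[1.0, 1.0],
    mu_b=[3.0, 4.0], sigma_b=[1.0, 2.0],
)
print(distance_sq)  # 26.0
```

Under this assumption, two samples whose posteriors are identical are at distance zero, so a loss built on this quantity can pull the posteriors of similar samples together. How the paper weights this term against the cross-entropy term is not stated in the abstract.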
