Lv Juncheng, Kang Zhao, Lu Xiao, Xu Zenglin
IEEE Trans Image Process. 2021;30:5252-5263. doi: 10.1109/TIP.2021.3079800. Epub 2021 May 31.
Auto-Encoder (AE)-based deep subspace clustering (DSC) methods have achieved impressive performance thanks to the powerful representations extracted by deep neural networks while prioritizing categorical separability. However, the self-reconstruction loss of an AE ignores rich, useful relational information and can lead to indiscriminative representations, which inevitably degrades clustering performance. It is also challenging to learn high-level similarity without feeding in semantic labels. Another unsolved problem facing DSC is the huge memory cost of the n×n similarity matrix incurred by the self-expression layer between the encoder and decoder. To tackle these problems, we use pairwise similarity to weight the reconstruction loss so as to capture local structure information, while the similarity itself is learned by the self-expression layer. Pseudo-graphs and pseudo-labels, which let the model benefit from the uncertain knowledge acquired during network training, are further employed to supervise similarity learning. Joint learning and iterative training facilitate obtaining an overall optimal solution. Extensive experiments on benchmark datasets demonstrate the superiority of our approach. By combining it with the k-nearest neighbors algorithm, we further show that our method can address the large-scale and out-of-sample problems. The source code of our method is available at: https://github.com/sckangz/SelfsupervisedSC.
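To make the described pipeline concrete, below is a minimal PyTorch sketch of an encoder, a self-expression layer (the source of the n×n memory cost), a decoder, and a similarity-weighted reconstruction loss. The architecture sizes, the symmetrized |C| similarity, and the loss weights are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
# Hypothetical sketch of AE + self-expression + similarity-weighted
# reconstruction; sizes, weighting scheme, and coefficients are assumptions.
import torch
import torch.nn as nn

class SelfExpression(nn.Module):
    """Learns an n x n coefficient matrix C such that Z ~= C @ Z in latent space."""
    def __init__(self, n: int):
        super().__init__()
        self.C = nn.Parameter(1e-4 * torch.randn(n, n))  # the memory-heavy n x n matrix

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.C @ z  # express each latent code as a combination of the others

n, d_in, d_lat = 64, 784, 32                      # toy sizes (assumed)
encoder = nn.Sequential(nn.Linear(d_in, d_lat), nn.ReLU())
decoder = nn.Sequential(nn.Linear(d_lat, d_in))
self_expr = SelfExpression(n)

x = torch.randn(n, d_in)                          # one batch holding all n samples
z = encoder(x)
z_se = self_expr(z)                               # self-expressed latent codes
x_hat = decoder(z_se)

# Pairwise similarity (here a symmetrized |C|) weights the pairwise
# reconstruction errors, so samples judged similar must reconstruct
# each other well -- this is how local structure enters the loss.
sim = 0.5 * (self_expr.C.abs() + self_expr.C.abs().t())
pair_err = torch.cdist(x_hat, x) ** 2             # n x n reconstruction errors
recon_loss = (sim * pair_err).mean()

se_loss = ((z - z_se) ** 2).mean()                # self-expression residual
reg = self_expr.C.abs().mean()                    # sparsity regularizer (assumed)
loss = recon_loss + 1.0 * se_loss + 0.1 * reg     # trade-off weights are placeholders
loss.backward()
```

Because C is a trainable n×n parameter, the whole dataset must pass through the self-expression layer together, which is exactly the memory bottleneck the abstract highlights.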
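The kNN-based out-of-sample extension mentioned above might be read as follows: cluster a tractable subset with the learned model, then assign each unseen sample by a nearest-neighbor vote in the learned embedding space. The function name, the scikit-learn classifier choice, and k=5 are assumptions for illustration.

```python
# Hedged sketch of kNN out-of-sample assignment; names and library
# choice (scikit-learn) are assumptions, not the authors' code.
import torch
from sklearn.neighbors import KNeighborsClassifier

@torch.no_grad()
def assign_out_of_sample(encoder, x_train, train_labels, x_new, k=5):
    z_train = encoder(x_train).cpu().numpy()   # embeddings of the clustered subset
    z_new = encoder(x_new).cpu().numpy()       # embeddings of unseen samples
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(z_train, train_labels)             # subset's cluster labels as targets
    return knn.predict(z_new)                  # cluster assignments for new data
```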