Department of Computing, Imperial College London, London, United Kingdom.
Google Research, Mountain View, CA.
IEEE Trans Pattern Anal Mach Intell. 2017 Mar;39(3):417-429. doi: 10.1109/TPAMI.2016.2554555. Epub 2016 Apr 15.
Semi-Non-negative Matrix Factorization is a technique that learns a low-dimensional representation of a dataset that lends itself to a clustering interpretation. The mapping between this new representation and the original data matrix may, however, contain rather complex hierarchical information with implicit lower-level hidden attributes that classical one-level clustering methodologies cannot uncover. In this work we propose a novel model, Deep Semi-NMF, that is able to learn such hidden representations, which lend themselves to a clustering interpretation according to different, unknown attributes of a given dataset. We also present a semi-supervised version of the algorithm, named Deep WSF, which can exploit (partial) prior information for each of the known attributes of a dataset, allowing the model to be used on datasets with mixed attribute knowledge. Finally, we show that our models learn low-dimensional representations that are better suited not only for clustering but also for classification, outperforming Semi-Non-negative Matrix Factorization as well as other state-of-the-art methods.
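To make the underlying factorization concrete, the following is a minimal sketch of single-layer Semi-NMF (the building block that the deep model stacks): a data matrix X is approximated as F G with G non-negative and F unconstrained, fitted here with the standard alternating multiplicative updates of Ding et al. This is an illustrative implementation, not the paper's Deep Semi-NMF algorithm; the function name `semi_nmf` and its parameters are our own.

```python
import numpy as np

def semi_nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Single-layer Semi-NMF sketch: X (d x n) ~ F (d x k) @ G (k x n),
    with G >= 0 and F unconstrained (entries of X and F may be negative).

    Alternates an exact least-squares solve for F with a multiplicative
    update for G that keeps it non-negative.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    G = rng.random((k, n))  # non-negative low-dimensional representation

    for _ in range(iters):
        # F-step: least-squares solution F = X G^T (G G^T)^{-1}
        F = X @ G.T @ np.linalg.pinv(G @ G.T)

        # G-step: multiplicative update using positive/negative parts,
        # G <- G * sqrt(((F^T X)^+ + (F^T F)^- G) / ((F^T X)^- + (F^T F)^+ G))
        A = F.T @ X
        B = F.T @ F
        Ap, An = np.maximum(A, 0), np.maximum(-A, 0)
        Bp, Bn = np.maximum(B, 0), np.maximum(-B, 0)
        G *= np.sqrt((Ap + Bn @ G) / (An + Bp @ G + eps))

    return F, G
```

A deep variant would recursively factorize the mixing matrix itself, X ~ Z1 Z2 ... Hm, so that each intermediate non-negative representation can capture clustering structure at a different level of abstraction.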