Ma Yuanyuan, Hu Xiaohua, He Tingting, Jiang Xingpeng
School of Information Management, Central China Normal University, Wuhan 430079, China.
School of Computer, Central China Normal University, Wuhan 430079, China.
Methods. 2016 Dec 1;111:80-84. doi: 10.1016/j.ymeth.2016.06.017. Epub 2016 Jun 20.
Nonnegative matrix factorization (NMF) has received considerable attention because it interprets observed samples as combinations of different components, and it has been used successfully as a clustering method. Symmetric NMF (SNMF) is an extension of NMF that inherits its advantages. Unlike NMF, however, SNMF takes a nonnegative similarity matrix as input and computes two identical lower-rank nonnegative factors (H, H) whose product approximates the original similarity matrix. Laplacian regularization (LR), a classic manifold regularization method, has improved the clustering performance of both NMF and SNMF, but it suffers from weak extrapolating ability. To address this limitation, we propose a novel variant of SNMF, called Hessian regularization based symmetric nonnegative matrix factorization (HSNMF). In contrast to Laplacian regularization, Hessian regularization fits the data well and extrapolates well to unseen data. We conduct extensive experiments on several datasets, including text data, gene expression data, and Human Microbiome Project (HMP) data. The results show that the proposed method outperforms the other methods compared, suggesting that HSNMF is potentially useful for clustering biological data.
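To make the SNMF setting described above concrete, the following is a minimal sketch of plain (unregularized) SNMF in Python with NumPy. It is not the authors' implementation: the damped multiplicative update with beta = 0.5 is a commonly used rule for symmetric factorizations and is assumed here only for illustration, and the function name `snmf`, its parameters, and the toy similarity matrix are all hypothetical. The Hessian (or Laplacian) regularization term that HSNMF adds is not included.

```python
import numpy as np

def snmf(A, k, n_iter=500, beta=0.5, eps=1e-10, seed=0):
    """Approximate a symmetric nonnegative matrix A (n x n) by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumption of this sketch,
    not necessarily the solver used in the paper).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))          # random nonnegative start
    for _ in range(n_iter):
        numer = A @ H                        # pulls H toward explaining A
        denom = H @ (H.T @ H) + eps          # current reconstruction times H
        H *= (1.0 - beta) + beta * numer / denom
    return H

# Toy similarity matrix (made up for illustration): two groups of five
# samples, high similarity within a group and low similarity across groups.
n_per = 5
A = np.full((2 * n_per, 2 * n_per), 0.05)
A[:n_per, :n_per] = 1.0
A[n_per:, n_per:] = 1.0

H = snmf(A, k=2)
labels = H.argmax(axis=1)                    # cluster = largest entry in each row of H
rel_err = np.linalg.norm(A - H @ H.T) / np.linalg.norm(A)
print(labels, rel_err)
```

Each row of H is read as a soft cluster membership, so taking the index of its largest entry gives a hard assignment. A regularized variant would add a penalty on H to the reconstruction error, built from a graph Laplacian or, in the case of HSNMF, a Hessian energy matrix estimated from the data.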