IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):974-987. doi: 10.1109/TCBB.2017.2665557. Epub 2017 Feb 7.
Non-negative Matrix Factorization (NMF), a classical method for dimensionality reduction, has been applied in many fields. It rests on the observation that negative values are physically meaningless in many data-processing tasks. Beyond its contribution to conventional data analysis, the recent surge of interest in NMF stems from its newly discovered ability to solve challenging data mining and machine learning problems, especially those involving gene expression data. This survey focuses on research that applies NMF to identify differentially expressed genes and to cluster samples. It summarizes the main NMF models, properties, principles, and algorithms, together with their generalizations, extensions, and modifications. Experimental results demonstrate the performance of the various NMF algorithms on both tasks.
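To make the idea concrete, here is a minimal sketch of basic NMF applied to sample clustering. It is not taken from the survey: it assumes the standard multiplicative update rules for the squared Frobenius error, a genes-by-samples matrix layout, and a made-up toy matrix. The names `nmf` and `cluster_samples` are hypothetical helpers chosen for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate X (genes x samples, non-negative) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error
    (an assumed, textbook choice; the survey covers many other variants).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, rank)) + eps    # basis matrix ("metagenes")
    H = rng.random((rank, n_samples)) + eps  # coefficients per sample
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(H):
    """Assign each sample to the component with the largest coefficient."""
    return H.argmax(axis=0)

# Toy expression matrix (made up): genes 0-2 are active in samples 0-2,
# genes 3-5 are active in samples 3-5.
X = np.zeros((6, 6))
X[:3, :3] = 5.0
X[3:, 3:] = 5.0

W, H = nmf(X, rank=2)
labels = cluster_samples(H)
print("cluster labels per sample:", labels)
print("reconstruction error:", np.linalg.norm(X - W @ H))
```

In this layout the columns of `W` play the role of metagenes and each column of `H` describes one sample, so genes with large entries in a column of `W` are natural candidates for further differential-expression analysis. The cluster numbering itself is arbitrary and depends on the random initialization.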