Hozumi Yuta, Wei Guo-Wei
Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA.
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA.
J Comput Appl Math. 2024 Aug 1;445. doi: 10.1016/j.cam.2024.115842. Epub 2024 Feb 19.
Single-cell RNA sequencing (scRNA-seq) is a relatively new technology that has stimulated enormous interest in statistics, data science, and computational biology due to the high dimensionality, complexity, and large scale of scRNA-seq data. Nonnegative matrix factorization (NMF) offers a unique approach because its low-dimensional components can be interpreted as meta-genes. However, existing NMF approaches lack multiscale analysis. This work introduces two persistent Laplacian regularized NMF methods, namely topological NMF (TNMF) and robust topological NMF (rTNMF). Using a total of 12 datasets, we demonstrate that the proposed TNMF and rTNMF significantly outperform all other NMF-based methods. We have also used TNMF and rTNMF for visualization with the popular Uniform Manifold Approximation and Projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE).
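To make the idea of Laplacian-regularized NMF concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard graph-regularized NMF objective ||X - WH||_F^2 + lam * Tr(H L H^T) with multiplicative updates, where X is a nonnegative genes-by-cells matrix and L = D - A is a graph Laplacian over cells. The multiscale graph here is a stand-in: a weighted sum of distance-threshold graphs at several radii, used only to illustrate combining several scales; the persistent Laplacian construction in the paper may differ. The function names, radii, and weights are hypothetical.

```python
import numpy as np

def multiscale_adjacency(X, radii, weights):
    """Weighted sum of distance-threshold graphs over the columns (cells) of X.

    Illustrative stand-in for a multiscale graph; radii and weights are assumptions.
    """
    sq = np.sum(X**2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    dist = np.sqrt(d2)
    A = np.zeros_like(dist)
    for r, w in zip(radii, weights):
        A += w * (dist <= r)  # connect cells closer than radius r
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def objective(X, W, H, A, lam):
    """||X - WH||_F^2 + lam * Tr(H L H^T), with L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(H @ L @ H.T)

def graph_regularized_nmf(X, A, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Multiplicative updates for graph-regularized NMF (X ~ W H, all nonnegative)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))  # meta-gene loadings, genes x k
    H = rng.random((k, n))  # low-dimensional cell representation, k x cells
    deg = A.sum(axis=1)     # diagonal of the degree matrix D
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X + lam * (H @ A)) / (W.T @ W @ H + lam * H * deg[None, :] + eps)
    return W, H
```

A typical call would be `W, H = graph_regularized_nmf(X, multiscale_adjacency(X, [1.5, 2.0], [1.0, 0.5]), k=3)`; the columns of `H` can then be passed to a clustering or visualization step. The multiplicative form keeps `W` and `H` nonnegative without an explicit projection.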