IEEE J Biomed Health Inform. 2024 Aug;28(8):4986-4994. doi: 10.1109/JBHI.2024.3400050. Epub 2024 Aug 6.
Single-cell RNA sequencing (scRNA-seq) has opened new perspectives on complex biological processes by revealing the subtle differences among individual cells. Accurate single-cell analysis is a prerequisite for in-depth investigation of the mechanisms underlying this heterogeneity. However, technical noise, such as dropout events, makes scRNA-seq data difficult to interpret. In this work, we propose an unsupervised learning framework for scRNA-seq data analysis, termed Sc-GNNMF. Exploiting the non-negativity and sparsity of scRNA-seq data, the framework applies a graph-regularized non-negative matrix factorization (GNNMF) algorithm, in which sparse cell-cell and gene-gene similarities are estimated through Laplacian kernels and p-nearest neighbor graphs (p-NNG). Assuming intrinsic geometric local invariance, we use weighted p-nearest known neighbors (p-NKN) to optimize the scRNA-seq data. The optimized data then enter the matrix decomposition, which draws cells of similar type closer together in the cell-gene data space and yields an embedding space better suited to clustering. In experiments on 11 real scRNA-seq datasets, Sc-GNNMF outperforms the compared methods while maintaining satisfactory compatibility and robustness. Sc-GNNMF also performs well in clustering, gene marker extraction, and pseudo-temporal analysis.
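The pipeline described in the abstract (Laplacian-kernel similarities, sparsified into p-nearest neighbor graphs, feeding a graph-regularized NMF) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the standard dual-graph objective ||X - U V^T||_F^2 + lam * Tr(U^T L_cell U) + lam * Tr(V^T L_gene V) with multiplicative updates, and the function names (`laplacian_kernel`, `pnn_graph`, `gnmf`), the shared weight `lam`, and the default `gamma` are all hypothetical choices made for illustration. The weighted p-NKN optimization step is omitted because the abstract does not specify it.

```python
import numpy as np


def laplacian_kernel(X, gamma=None):
    """Laplacian kernel exp(-gamma * ||x_i - x_j||_1) between the rows of X."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default: 1 / number of features
    # Pairwise L1 distances; builds an (n, n, m) array, so toy sizes only.
    d = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    return np.exp(-gamma * d)


def pnn_graph(K, p):
    """Keep each row's p largest off-diagonal similarities, then symmetrize."""
    n = K.shape[0]
    S = K.copy()
    np.fill_diagonal(S, -np.inf)  # exclude self-similarity from the ranking
    idx = np.argsort(-S, axis=1)[:, :p]
    rows = np.repeat(np.arange(n), p)
    W = np.zeros_like(K)
    W[rows, idx.ravel()] = K[rows, idx.ravel()]
    return np.maximum(W, W.T)


def gnmf(X, k, W_cell, W_gene, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Factor X (cells x genes) as U @ V.T with graph penalties on U and V."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k))  # cell embedding
    V = rng.random((m, k))  # gene embedding
    D_cell = np.diag(W_cell.sum(axis=1))  # degree matrices; L = D - W
    D_gene = np.diag(W_gene.sum(axis=1))
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (X @ V + lam * W_cell @ U) / (U @ (V.T @ V) + lam * D_cell @ U + eps)
        V *= (X.T @ U + lam * W_gene @ V) / (V @ (U.T @ U) + lam * D_gene @ V + eps)
    return U, V


if __name__ == "__main__":
    # Synthetic count matrix standing in for a cells x genes expression table.
    rng = np.random.default_rng(1)
    X = rng.poisson(1.0, size=(30, 20)).astype(float)
    W_cell = pnn_graph(laplacian_kernel(X), p=5)
    W_gene = pnn_graph(laplacian_kernel(X.T), p=5)
    U, V = gnmf(X, k=3, W_cell=W_cell, W_gene=W_gene, lam=0.1)
    print(U.shape, V.shape)
```

In such a sketch, the rows of `U` would serve as the low-dimensional cell embedding passed to a clustering method such as k-means; the choice of `k`, `p`, and `lam` here is arbitrary.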