Dimensionality reduction for single cell RNA sequencing data using constrained robust non-negative matrix factorization.

Authors

Zhang Shuqin, Yang Liu, Yang Jinwen, Lin Zhixiang, Ng Michael K

Affiliations

School of Mathematical Sciences, Fudan University, Shanghai 200433, China.

College of Intelligence and Computing, Tianjin University, Tianjin 300350, China.

Publication information

NAR Genom Bioinform. 2020 Aug 28;2(3):lqaa064. doi: 10.1093/nargab/lqaa064. eCollection 2020 Sep.

DOI:10.1093/nargab/lqaa064

PMID:33575614

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7671375/

Abstract

Single cell RNA-sequencing (scRNA-seq) technology, a powerful tool for analyzing the entire transcriptome at single cell level, is receiving increasing research attention. The presence of dropouts is an important characteristic of scRNA-seq data that may affect the performance of downstream analyses, such as dimensionality reduction and clustering. Cells sequenced to lower depths tend to have more dropouts than those sequenced to greater depths. In this study, we aimed to develop a dimensionality reduction method to address both dropouts and the non-negativity constraints in scRNA-seq data. The developed method simultaneously performs dimensionality reduction and dropout imputation under the non-negative matrix factorization (NMF) framework. The dropouts were modeled as a non-negative sparse matrix. Summation of the observed data matrix and dropout matrix was approximated by NMF. To ensure the sparsity pattern was maintained, a weighted ℓ1 penalty that took into account the dependency of dropouts on the sequencing depth in each cell was imposed. An efficient algorithm was developed to solve the proposed optimization problem. Experiments using both synthetic data and real data showed that dimensionality reduction via the proposed method afforded more robust clustering results compared with those obtained from the existing methods, and that dropout imputation improved the differential expression analysis.
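The structure described in the abstract (observed matrix plus a non-negative sparse dropout matrix, approximated by NMF, with a depth-dependent weighted ℓ1 penalty) can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm. It assumes an objective of the form 0.5·‖X + S − WH‖² + Σ_j w_j‖S_j‖₁ with W, H, S ≥ 0, restricts S to the zero entries of X, and uses textbook multiplicative updates. The function name `robust_nmf_sketch`, the parameter `lam`, and the choice of weights proportional to each cell's sequencing depth are all hypothetical and chosen only for illustration.

```python
import numpy as np

def robust_nmf_sketch(X, rank, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Illustrative robust NMF with dropout imputation (not the paper's solver).

    X is a non-negative genes x cells matrix. Returns W, H, the imputed
    dropout matrix S, and the objective value after each iteration.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_cells)) + eps
    S = np.zeros_like(X)

    # Assumption: dropouts can only sit where the observed count is zero.
    zero_mask = X == 0

    # Assumption: per-cell penalty grows with sequencing depth, so shallow
    # cells (which tend to have more dropouts) are imputed more readily.
    depth = X.sum(axis=0)
    weights = lam * depth / max(depth.mean(), eps)

    history = []
    for _ in range(n_iter):
        Y = X + S  # current "completed" data, still non-negative

        # Standard multiplicative updates for the Frobenius NMF objective.
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ (H @ H.T) + eps)

        # Closed-form minimiser in S: non-negative soft-thresholding of the
        # residual WH - X, applied only on the zero entries of X.
        S = np.where(zero_mask, np.maximum(W @ H - X - weights, 0.0), 0.0)

        obj = 0.5 * np.linalg.norm(X + S - W @ H) ** 2 + (weights * S).sum()
        history.append(obj)
    return W, H, S, history
```

In this sketch the columns of `H` serve as the low-dimensional cell representation that a clustering step could consume, and `X + S` is the imputed expression matrix. How the weights should actually depend on sequencing depth is specified in the paper itself, not here.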

