Falai Chen's Lab, China.
Division of Life Sciences and Medicine, USTC, China.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab085.
The low capture rate of expressed RNAs in single-cell sequencing is one of the major obstacles to downstream functional genomics analyses. A number of imputation methods have recently emerged for single-cell transcriptome data; however, recovering missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), which imputes gene expression matrices using a biased low-rank matrix decomposition method. WEDGE successfully recovered expression matrices, reproduced the cell-wise and gene-wise correlations, and improved the clustering of cells, and it performed particularly well on sparse datasets. Overall, this study presents an effective approach for imputing sparse expression matrix data, and our WEDGE algorithm should help researchers explore more fully the biological meaning embedded in their single-cell RNA sequencing datasets. The source code of WEDGE has been released at https://github.com/QuKunLab/WEDGE.
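To illustrate the general idea of a weighted low-rank decomposition for imputation, the following is a minimal sketch, not the WEDGE implementation. It assumes a non-negative factorization fitted by standard multiplicative updates for weighted NMF, in which zero entries (possible dropouts) contribute to the loss with a reduced weight. The function name `weighted_lowrank_impute`, the parameters `rank` and `zero_weight`, and their default values are hypothetical choices made for this example.

```python
import numpy as np


def weighted_lowrank_impute(A, rank=2, zero_weight=0.15, n_iter=200, seed=0):
    """Illustrative weighted non-negative low-rank imputation (assumed scheme).

    A is a genes x cells matrix of non-negative expression values.
    Non-zero entries get weight 1; zero entries get `zero_weight` (< 1),
    so the fit is driven mainly by the observed values.
    Returns the reconstructed matrix W @ H and the loss after each iteration.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_cells = A.shape

    # Per-entry weights: trust observed values more than zeros.
    omega = np.where(A > 0, 1.0, zero_weight)

    # Strictly positive random initialization of the two factors.
    W = rng.random((n_genes, rank)) + 0.1
    H = rng.random((rank, n_cells)) + 0.1
    eps = 1e-12  # guards against division by zero

    losses = []
    for _ in range(n_iter):
        # Multiplicative updates for the weighted squared-error objective
        # sum(omega * (A - W @ H) ** 2); they keep W and H non-negative.
        WH = W @ H
        W *= ((omega * A) @ H.T) / ((omega * WH) @ H.T + eps)
        WH = W @ H
        H *= (W.T @ (omega * A)) / (W.T @ (omega * WH) + eps)
        losses.append(float(np.sum(omega * (A - W @ H) ** 2)))

    return W @ H, losses


# Toy usage: a rank-2 matrix with roughly half of its entries set to zero.
rng = np.random.default_rng(1)
truth = rng.random((30, 2)) @ rng.random((2, 40))
observed = np.where(rng.random(truth.shape) < 0.5, 0.0, truth)
imputed, losses = weighted_lowrank_impute(observed, rank=2)
```

In this sketch the down-weighting of zeros is what makes the decomposition "biased" toward the observed values; the rank and the zero weight would need to be chosen for real data, and the released WEDGE code should be consulted for the authors' actual procedure.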