scRMD: imputation for single cell RNA-seq data via robust matrix decomposition.

Affiliations

Department of Probability and Statistics, School of Mathematical Sciences, Peking University, Beijing 100871, China.

Damo Academy, Alibaba Group, Beijing 100029, China.

Publication information

Bioinformatics. 2020 May 1;36(10):3156-3161. doi: 10.1093/bioinformatics/btaa139.

DOI:10.1093/bioinformatics/btaa139

PMID:32119079

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) technology enables whole-transcriptome profiling at single-cell resolution and holds great promise for many biological and medical applications. Nevertheless, scRNA-seq often fails to capture expressed genes, leading to the prominent dropout problem. These dropouts cause many problems in downstream analysis, such as a significant increase in noise, loss of power in differential expression analysis, and obscured gene-to-gene or cell-to-cell relationships. Imputing these dropout values can be beneficial in scRNA-seq data analysis.

RESULTS

In this article, we model the dropout imputation problem as robust matrix decomposition. This model makes minimal assumptions and allows us to develop a computationally efficient imputation method called scRMD. Extensive data analysis shows that scRMD can accurately recover the dropout values and helps to improve downstream analyses such as differential expression analysis and clustering analysis.
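
To make the idea of robust matrix decomposition concrete, the following is a minimal sketch of a generic low-rank plus sparse decomposition (principal component pursuit solved with ADMM). It is not the scRMD algorithm or its R implementation: the abstract does not state scRMD's objective or constraints, so the function names (`soft_threshold`, `svd_threshold`, `robust_decompose`) and the default values of `lam` and `mu` are assumptions chosen for illustration, following common heuristics for this family of methods.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (proximal map of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (proximal map of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def robust_decompose(y, lam=None, mu=None, n_iter=1000, tol=1e-7):
    """Split y into a low-rank part and a sparse part with y ~= low + sparse.

    Generic principal component pursuit via ADMM; an illustrative sketch only,
    not the scRMD method itself.
    """
    m, n = y.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))                  # common default weight (assumption)
    if mu is None:
        mu = 0.25 * m * n / (np.abs(y).sum() + 1e-12)   # common step-size heuristic (assumption)

    low = np.zeros_like(y, dtype=float)
    sparse = np.zeros_like(y, dtype=float)
    dual = np.zeros_like(y, dtype=float)
    norm_y = np.linalg.norm(y) + 1e-12

    for _ in range(n_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low = svd_threshold(y - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(y - low + dual / mu, lam / mu)
        resid = y - low - sparse
        dual = dual + mu * resid
        if np.linalg.norm(resid) / norm_y < tol:
            break
    return low, sparse


# Toy usage: a rank-2 "expression" matrix with 5% of entries zeroed out.
rng = np.random.default_rng(0)
truth = np.abs(rng.normal(size=(80, 2))) @ np.abs(rng.normal(size=(2, 60)))
dropout = rng.random(truth.shape) < 0.05
observed = np.where(dropout, 0.0, truth)
low_rank, sparse_part = robust_decompose(observed)
```

In this toy setting the low-rank component plays the role of the imputed expression matrix, while the sparse component absorbs the entries that were zeroed out. A method tailored to scRNA-seq would add domain-specific constraints, which this sketch deliberately leaves out.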

AVAILABILITY AND IMPLEMENTATION

The R package scRMD is available at https://github.com/XiDsLab/scRMD.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

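Abstract

Motivation

Single-cell RNA sequencing (scRNA-seq) technology enables whole-transcriptome profiling at single-cell resolution and holds great promise for many biological and medical applications. However, scRNA-seq often fails to capture expressed genes, which leads to a prominent dropout problem. These dropouts cause many problems in downstream analysis, such as a significant increase in noise, loss of power in differential expression analysis, and blurring of gene-to-gene or cell-to-cell relationships. Imputing these dropout values can be beneficial in scRNA-seq data analysis.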