School of Mathematics and Statistics, Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan 430079, China.
College of Information Engineering, Shenzhen University, Shenzhen 518060, China.
Bioinformatics. 2020 May 1;36(10):3131-3138. doi: 10.1093/bioinformatics/btaa108.
Single-cell RNA sequencing (scRNA-seq) methods make it possible to reveal gene expression patterns at single-cell resolution. Due to technical defects, dropout events in scRNA-seq add noise to the gene-cell expression matrix and hinder downstream analysis. Therefore, it is important to recover the true gene expression levels before carrying out downstream analysis.
In this article, we develop an imputation method, called scTSSR, to recover gene expression for scRNA-seq. Unlike most existing methods, which impute dropout events by borrowing information across either genes or cells alone, scTSSR simultaneously leverages information from both similar genes and similar cells using a two-side sparse self-representation model. We demonstrate that scTSSR can effectively capture the Gini coefficients of genes and the gene-to-gene correlations observed in single-molecule RNA fluorescence in situ hybridization (smRNA FISH). Down-sampling experiments indicate that scTSSR performs better than existing methods in recovering the true gene expression levels. We also show that scTSSR performs competitively in differential expression analysis, cell clustering and cell trajectory inference.
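To make the idea of a two-side sparse self-representation concrete, the following is a minimal sketch and not the scTSSR package's implementation. It assumes the objective 0.5 * ||X - A X - X B||_F^2 + lam * (||A||_1 + ||B||_1), with A (gene by gene) and B (cell by cell) non-negative and zero on the diagonal, and solves it with plain proximal gradient descent. The function names, the single penalty `lam`, and the solver are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def soft_threshold_nonneg(M, t):
    """Proximal step for t * ||M||_1 restricted to non-negative entries."""
    return np.maximum(M - t, 0.0)


def two_side_self_representation(X, lam=0.1, n_iter=200):
    """Approximate X (genes x cells) by A @ X + X @ B.

    A (genes x genes) borrows information from similar genes and
    B (cells x cells) borrows information from similar cells.
    Both are kept sparse, non-negative, and zero on the diagonal,
    so no gene or cell is explained by itself.
    Hypothetical sketch: objective and solver are assumptions.
    """
    m, n = X.shape
    A = np.zeros((m, m))
    B = np.zeros((n, n))
    # The smooth part has a gradient Lipschitz constant of at most
    # 2 * ||X||_2^2, so this step size keeps the iteration stable.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        R = A @ X + X @ B - X          # residual of the current fit
        grad_A = R @ X.T               # gradient w.r.t. A
        grad_B = X.T @ R               # gradient w.r.t. B
        A = soft_threshold_nonneg(A - step * grad_A, step * lam)
        B = soft_threshold_nonneg(B - step * grad_B, step * lam)
        np.fill_diagonal(A, 0.0)       # forbid trivial self-representation
        np.fill_diagonal(B, 0.0)
    return A, B, A @ X + X @ B


def objective(X, A, B, lam):
    """Value of the assumed objective, useful for checking progress."""
    fit = 0.5 * np.linalg.norm(X - A @ X - X @ B, "fro") ** 2
    return fit + lam * (np.abs(A).sum() + np.abs(B).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.poisson(2.0, size=(20, 15)).astype(float)  # toy count matrix
    A, B, X_hat = two_side_self_representation(X, lam=0.1)
    print(objective(X, np.zeros((20, 20)), np.zeros((15, 15)), 0.1))
    print(objective(X, A, B, 0.1))
```

In this sketch the reconstruction `X_hat` plays the role of the imputed matrix. A practical method would also need to choose the penalties and decide how imputed values are combined with the observed counts, which the abstract does not describe.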
The R package is available at https://github.com/Zhangxf-ccnu/scTSSR.
Supplementary data are available at Bioinformatics online.