Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, Delhi, India.
Center for Computational Biology, Indraprastha Institute of Information Technology, Delhi, India.
Sci Rep. 2018 Nov 5;8(1):16329. doi: 10.1038/s41598-018-34688-x.
The emergence of single-cell RNA sequencing (scRNA-seq) technologies has enabled us to measure the expression levels of thousands of genes at single-cell resolution. However, insufficient quantities of starting RNA in individual cells cause significant dropout events, introducing a large number of zero counts in the expression matrix. To circumvent this, we developed an autoencoder-based sparse gene expression matrix imputation method, AutoImpute, which learns the inherent distribution of the input scRNA-seq data and imputes the missing values accordingly, with minimal modification to the biologically silent genes. When tested on real scRNA-seq datasets, AutoImpute performed competitively with respect to existing single-cell imputation methods in terms of expression recovery from subsampled data, cell-clustering accuracy, variance stabilization and cell-type separability.
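To make the idea of autoencoder-based imputation concrete, the following is a minimal sketch, not the AutoImpute implementation. It assumes a single sigmoid hidden layer, a squared-error loss evaluated only on nonzero entries, L2 weight regularization and plain gradient descent; the function name `masked_autoencoder_impute`, all hyperparameters and the toy data are hypothetical. It also fills every zero with the reconstruction, so it makes no attempt to tell true biological zeros from dropouts.

```python
import numpy as np

def masked_autoencoder_impute(X, hidden=8, lr=0.1, epochs=300, lam=1e-3, seed=0):
    """Impute zeros in a cells x genes matrix with a one-hidden-layer autoencoder.

    Illustrative assumption: zeros are treated as unobserved, so the
    reconstruction error is computed only on nonzero entries.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    mask = (X > 0).astype(float)            # 1 where a value was observed
    n_obs = max(mask.sum(), 1.0)

    W1 = rng.normal(0.0, 0.1, (n_genes, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, n_genes))
    b2 = np.zeros(n_genes)

    losses = []
    for _ in range(epochs):
        # Forward pass: sigmoid encoder, linear decoder.
        H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
        R = H @ W2 + b2
        E = mask * (R - X)                  # error on observed entries only
        losses.append(float((E ** 2).sum() / n_obs))

        # Backward pass for the masked mean squared error plus L2 penalty.
        dR = 2.0 * E / n_obs
        gW2 = H.T @ dR + lam * W2
        gb2 = dR.sum(axis=0)
        dZ = (dR @ W2.T) * H * (1.0 - H)
        gW1 = X.T @ dZ + lam * W1
        gb1 = dZ.sum(axis=0)

        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    # Final reconstruction; observed values are kept, zeros are filled in.
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    R = H @ W2 + b2
    imputed = np.where(mask > 0, X, R)
    return imputed, losses

# Toy data (hypothetical): low-rank expression with about 30% simulated dropout.
rng = np.random.default_rng(1)
true_expr = rng.gamma(2.0, 1.0, (30, 2)) @ rng.gamma(2.0, 1.0, (2, 20))
keep = rng.random((30, 20)) > 0.3
X = np.log1p(true_expr) * keep
imputed, losses = masked_autoencoder_impute(X)
```

Restricting the loss to observed entries is the design choice that lets the network learn from measured values without being pulled toward the zeros it is meant to replace.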