A fast and efficient count-based matrix factorization method for detecting cell types from single-cell RNAseq data.

Authors

Sun Shiquan, Chen Yabo, Liu Yang, Shang Xuequn

Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, 710129, People's Republic of China.

Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, Shaanxi, 710129, People's Republic of China.

Publication information

BMC Syst Biol. 2019 Apr 5;13(Suppl 2):28. doi: 10.1186/s12918-019-0699-6.

Abstract

BACKGROUND

Single-cell RNA sequencing (scRNAseq) data always involve various unwanted variables, which can mask the true signal needed to identify cell types. A more efficient way of dealing with this issue is to extract low-dimensional information from high-dimensional gene expression data to represent cell-type structure. In the past two years, several powerful matrix factorization tools have been developed for scRNAseq data, such as NMF, ZIFA, pCMF and ZINB-WaVE. However, the existing approaches either cannot directly model the raw counts of scRNAseq data or are very time-consuming when handling a large number of cells (e.g. n>500).

RESULTS

In this paper, we developed a fast and efficient count-based matrix factorization method (single-cell negative binomial matrix factorization, scNBMF), built on the TensorFlow framework, to infer the low-dimensional structure of cell types. To evaluate the method and its scalability, we conducted a series of experiments on three public scRNAseq data sets: brain, embryonic stem cell, and pancreatic islet. The experimental results show that scNBMF is more powerful at detecting cell types and 10 to 100 times faster than the bespoke scRNAseq tools.
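As a rough illustration of what a count-based negative binomial factorization involves, the sketch below fits a low-rank model to a toy count matrix by gradient ascent on the negative binomial log-likelihood. It is not the authors' TensorFlow implementation: the log link, the single shared `size` (inverse dispersion) parameter, the step-halving rule, the function names `nb_loglik`, `gradient_step` and `fit_nbmf`, and the toy data are all assumptions made for this example.

```python
import math
import random


def nb_loglik(counts, W, H, size):
    """Negative binomial log-likelihood with log-mean (W @ H)[i][j]."""
    rank = len(H)
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            mu = math.exp(sum(W[i][k] * H[k][j] for k in range(rank)))
            total += (
                math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
                + size * math.log(size / (size + mu))
                + y * math.log(mu / (size + mu))
            )
    return total


def gradient_step(counts, W, H, size, lr):
    """Return (W, H) after one gradient-ascent step on the log-likelihood."""
    n, p, rank = len(counts), len(counts[0]), len(H)
    # d loglik / d log(mu_ij) = size * (y_ij - mu_ij) / (size + mu_ij)
    G = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            mu = math.exp(sum(W[i][k] * H[k][j] for k in range(rank)))
            G[i][j] = size * (counts[i][j] - mu) / (size + mu)
    new_W = [[W[i][k] + lr * sum(G[i][j] * H[k][j] for j in range(p))
              for k in range(rank)] for i in range(n)]
    new_H = [[H[k][j] + lr * sum(G[i][j] * W[i][k] for i in range(n))
              for j in range(p)] for k in range(rank)]
    return new_W, new_H


def fit_nbmf(counts, rank=2, size=5.0, lr=0.01, steps=300, seed=0):
    """Fit cell factors W (cells x rank) and gene loadings H (rank x genes)."""
    rng = random.Random(seed)
    n, p = len(counts), len(counts[0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n)]
    H = [[rng.gauss(0.0, 0.1) for _ in range(p)] for _ in range(rank)]
    best = nb_loglik(counts, W, H, size)
    for _ in range(steps):
        cand_W, cand_H = gradient_step(counts, W, H, size, lr)
        cand = nb_loglik(counts, cand_W, cand_H, size)
        if cand > best:
            W, H, best = cand_W, cand_H, cand  # accept the improving step
        else:
            lr *= 0.5  # step too large: shrink it and try again
    return W, H, best


# Toy data (made up): 6 cells x 4 genes, two groups with opposite expression.
counts = [
    [9, 11, 0, 1],
    [12, 8, 1, 0],
    [10, 10, 2, 1],
    [0, 1, 9, 12],
    [1, 0, 11, 9],
    [2, 1, 10, 10],
]
W, H, loglik = fit_nbmf(counts)
```

Each row of `W` is a low-dimensional representation of one cell, which could then be passed to a clustering step to group cells by type. A practical implementation would typically add gene-specific dispersions, normalization for sequencing depth, and vectorized or GPU-backed optimization, none of which this sketch attempts.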

CONCLUSIONS

In this paper, we proposed a fast and efficient count-based matrix factorization method, scNBMF, which is more powerful for detecting cell types. A series of experiments was performed on three public scRNAseq data sets. The results show that scNBMF is a more powerful tool for large-scale scRNAseq data analysis. scNBMF is implemented in R and Python, and the source code is freely available at https://github.com/sqsun .

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae32/6449882/eccd6c497003/12918_2019_699_Fig1_HTML.jpg
