Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.

Affiliations

Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, United States.

Department of Data Science, Dana Farber Cancer Institute, Boston, MA 02215, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae494.

Abstract

SUMMARY

Motivated by theoretical and practical issues that arise when applying principal component analysis (PCA) to count data, Townes et al. introduced "Poisson GLM-PCA", a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (scRNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem, and show that a simple algorithm, which we call "Alternating Poisson Regression" (APR), produces better-quality fits, and in less time, than existing algorithms. APR is also memory-efficient and lends itself to parallel implementation on multi-core processors, both of which are helpful for handling large scRNA-seq datasets. We illustrate the benefits of this approach on three publicly available scRNA-seq datasets. The new algorithms are implemented in an R package, fastglmpca.
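The alternating idea named in the summary can be illustrated with a toy example. The following is a minimal sketch in pure Python and not the fastglmpca implementation: it assumes a rank-1 model in which count y[i][j] is Poisson with rate exp(l[i] * f[j]), omits intercepts and size factors, and uses a single Newton step with step halving as the inner solver, which is a choice made here for brevity and may differ from what the package does. The function names `poisson_update` and `fit_rank1` are hypothetical.

```python
import math

def loglik(Y, l, f):
    """Poisson log-likelihood for rates exp(l[i] * f[j]), dropping the constant log(y!) terms."""
    return sum(Y[i][j] * l[i] * f[j] - math.exp(l[i] * f[j])
               for i in range(len(l)) for j in range(len(f)))

def poisson_update(y, x, b, max_halvings=30):
    """One Newton step, with step halving, for the 1-d regression y[k] ~ Poisson(exp(b * x[k]))."""
    def ll(bb):
        try:
            return sum(yk * bb * xk - math.exp(bb * xk) for yk, xk in zip(y, x))
        except OverflowError:
            return float("-inf")  # treat an overflowing trial value as unacceptable
    grad = sum(xk * (yk - math.exp(b * xk)) for yk, xk in zip(y, x))
    curv = sum(xk * xk * math.exp(b * xk) for xk in x)
    if curv <= 0.0:
        return b  # covariates are all zero, so nothing to update
    step, base = grad / curv, ll(b)
    for _ in range(max_halvings):
        if ll(b + step) >= base:  # accept only if the objective does not decrease
            return b + step
        step /= 2.0
    return b

def fit_rank1(Y, iters=20):
    """Alternate between row-wise and column-wise Poisson regressions (illustrative only)."""
    n, m = len(Y), len(Y[0])
    l = [0.1] * n
    f = [0.1] * m
    history = [loglik(Y, l, f)]
    for _ in range(iters):
        # Fix f: each row i is an independent Poisson regression for l[i].
        l = [poisson_update(Y[i], f, l[i]) for i in range(n)]
        # Fix l: each column j is an independent Poisson regression for f[j].
        f = [poisson_update([Y[i][j] for i in range(n)], l, f[j]) for j in range(m)]
        history.append(loglik(Y, l, f))
    return l, f, history

# Toy 3 x 4 count matrix (made-up numbers, for illustration only).
Y = [[5, 0, 3, 1], [9, 1, 6, 2], [2, 0, 1, 0]]
l, f, history = fit_rank1(Y)
print(history[0], history[-1])
```

Because the regressions within each half-step are independent of one another, they could in principle be distributed across cores, which is consistent with the parallelism noted in the summary. Each accepted update leaves the log-likelihood no lower than before, so the recorded `history` should be non-decreasing up to rounding.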

AVAILABILITY AND IMPLEMENTATION

The fastglmpca R package is released on CRAN for Windows, macOS and Linux, and the source code is available at github.com/stephenslab/fastglmpca under the open source GPL-3 license. Scripts to reproduce the results in this paper are also available in the GitHub repository and on Zenodo.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a9d/11322042/4a2679d0755a/btae494f1.jpg
