Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.

Affiliations

Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, United States.

Department of Data Science, Dana Farber Cancer Institute, Boston, MA 02215, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae494.

DOI: 10.1093/bioinformatics/btae494
PMID: 39110511
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11322042/
Abstract

SUMMARY

Motivated by theoretical and practical issues that arise when applying principal component analysis (PCA) to count data, Townes et al. introduced "Poisson GLM-PCA", a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (scRNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem, and show that a simple algorithm, which we call "Alternating Poisson Regression" (APR), produces better quality fits, and in less time, than existing algorithms. APR is also memory-efficient and lends itself to parallel implementation on multi-core processors, both of which are helpful for handling large scRNA-seq datasets. We illustrate the benefits of this approach in three publicly available scRNA-seq datasets. The new algorithms are implemented in an R package, fastglmpca.

AVAILABILITY AND IMPLEMENTATION

The fastglmpca R package is released on CRAN for Windows, macOS and Linux, and the source code is available at github.com/stephenslab/fastglmpca under the open source GPL-3 license. Scripts to reproduce the results in this paper are also available in the GitHub repository and on Zenodo.
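
The alternating idea named in the summary can be illustrated with a small, standard-library-only Python sketch. This is not the fastglmpca implementation (which is an R package) and is not taken from the paper; it assumes a bare-bones model in which each count Y[i][j] is Poisson with log-rate given by the inner product of row i of L and row j of F, and it omits details such as intercept terms and identifiability constraints that a full implementation would need to handle. The function names, the coordinate-wise Newton update with step halving, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def safe_exp(x):
    # Guard against OverflowError on wild trial steps; inf makes the trial be rejected.
    return math.exp(x) if x < 700.0 else float("inf")


def row_loglik(y, eta):
    # Poisson log-likelihood of one vector of counts, dropping the constant -log(y!) term.
    return sum(yj * ej - safe_exp(ej) for yj, ej in zip(y, eta))


def poisson_regression_sweep(y, X, b):
    """One sweep of coordinate-wise Newton updates for y[j] ~ Poisson(exp(X[j] . b)).

    Each trial step is halved until the log-likelihood does not decrease,
    so a sweep can never make the fit worse. Updates b in place.
    """
    eta = [sum(x * bk for x, bk in zip(row, b)) for row in X]
    current = row_loglik(y, eta)
    for k in range(len(b)):
        grad = sum(row[k] * (yj - safe_exp(ej)) for row, yj, ej in zip(X, y, eta))
        curv = sum(row[k] ** 2 * safe_exp(ej) for row, ej in zip(X, eta))
        if curv <= 0.0:
            continue
        step = grad / curv
        while abs(step) > 1e-12:
            trial_eta = [ej + step * row[k] for ej, row in zip(eta, X)]
            trial = row_loglik(y, trial_eta)
            if trial >= current:
                b[k] += step
                eta, current = trial_eta, trial
                break
            step /= 2.0
    return b


def alternating_poisson_regression(Y, K, iters=20, seed=0):
    """Fit log-rates L F^T to the count matrix Y by alternating row-wise regressions."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    L = [[rng.gauss(0.0, 0.1) for _ in range(K)] for _ in range(n)]
    F = [[rng.gauss(0.0, 0.1) for _ in range(K)] for _ in range(m)]
    Yt = [list(col) for col in zip(*Y)]  # columns of Y, used when updating F

    def total_loglik():
        return sum(
            row_loglik(Y[i], [sum(l * f for l, f in zip(L[i], F[j])) for j in range(m)])
            for i in range(n)
        )

    trace = [total_loglik()]
    for _ in range(iters):
        for i in range(n):  # F fixed: regress row i of Y on F to update L[i]
            poisson_regression_sweep(Y[i], F, L[i])
        for j in range(m):  # L fixed: regress column j of Y on L to update F[j]
            poisson_regression_sweep(Yt[j], L, F[j])
        trace.append(total_loglik())
    return L, F, trace


if __name__ == "__main__":
    toy_counts = [[5, 0, 1], [4, 1, 0], [0, 6, 7], [1, 5, 8]]  # made-up data
    L, F, trace = alternating_poisson_regression(toy_counts, K=2, iters=10)
    print("log-likelihood, first and last:", trace[0], trace[-1])
```

In this sketch, once F is held fixed, the update for each row of L depends only on that row of Y, and likewise for F when L is fixed. That independence is the property that makes an alternating scheme of this kind a natural fit for parallel execution across rows.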


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a9d/11322042/4a2679d0755a/btae494f1.jpg

Similar articles

1
Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.
Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae494.
2
Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca.
bioRxiv. 2024 Jul 4:2024.03.23.586420. doi: 10.1101/2024.03.23.586420.
3
glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data.
Bioinformatics. 2021 Apr 5;36(24):5701-5702. doi: 10.1093/bioinformatics/btaa1009.
4
Benchmarking principal component analysis for large-scale single-cell RNA-sequencing.
Genome Biol. 2020 Jan 20;21(1):9. doi: 10.1186/s13059-019-1900-3.
5
SCell: integrated analysis of single-cell RNA-seq data.
Bioinformatics. 2016 Jul 15;32(14):2219-20. doi: 10.1093/bioinformatics/btw201. Epub 2016 Apr 19.
6
scHFC: a hybrid fuzzy clustering method for single-cell RNA-seq data optimized by natural computation.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbab588.
7
scTPC: a novel semisupervised deep clustering model for scRNA-seq data.
Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae293.
8
Visualization of Single Cell RNA-Seq Data Using t-SNE in R.
Methods Mol Biol. 2020;2117:159-167. doi: 10.1007/978-1-0716-0301-7_8.
9
Dimensionality Reduction of Single-Cell RNA-Seq Data.
Methods Mol Biol. 2021;2284:331-342. doi: 10.1007/978-1-0716-1307-8_18.
10
Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data.
Genome Biol. 2021 Sep 6;22(1):258. doi: 10.1186/s13059-021-02451-7.

References cited in this article

1
FastRNA: An efficient solution for PCA of single-cell RNA-sequencing data based on a batch-accounting count model.
Am J Hum Genet. 2022 Nov 3;109(11):1974-1985. doi: 10.1016/j.ajhg.2022.09.008. Epub 2022 Oct 6.
2
NewWave: a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA-seq data.
Bioinformatics. 2022 Apr 28;38(9):2648-2650. doi: 10.1093/bioinformatics/btac149.
3
Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis.
Nat Genet. 2021 Jun;53(6):770-777. doi: 10.1038/s41588-021-00873-4. Epub 2021 May 24.
4
Dimensionality Reduction of Single-Cell RNA-Seq Data.
Methods Mol Biol. 2021;2284:331-342. doi: 10.1007/978-1-0716-1307-8_18.
5
Benchmarking principal component analysis for large-scale single-cell RNA-sequencing.
Genome Biol. 2020 Jan 20;21(1):9. doi: 10.1186/s13059-019-1900-3.
6
Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model.
Genome Biol. 2019 Dec 23;20(1):295. doi: 10.1186/s13059-019-1861-6.
7
Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis.
Genome Biol. 2019 Dec 10;20(1):269. doi: 10.1186/s13059-019-1898-6.
8
Orchestrating single-cell analysis with Bioconductor.
Nat Methods. 2020 Feb;17(2):137-145. doi: 10.1038/s41592-019-0654-x. Epub 2019 Dec 2.
9
Comprehensive Integration of Single-Cell Data.
Cell. 2019 Jun 13;177(7):1888-1902.e21. doi: 10.1016/j.cell.2019.05.031. Epub 2019 Jun 6.
10
Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares.
J Mach Learn Res. 2015;16:3367-3402.