glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data.

Affiliation

Genome Biology Unit, EMBL, Heidelberg 69117, Germany.

Publication information

Bioinformatics. 2021 Apr 5;36(24):5701-5702. doi: 10.1093/bioinformatics/btaa1009.

DOI: 10.1093/bioinformatics/btaa1009
PMID: 33295604
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8023675/
Abstract

MOTIVATION

The Gamma-Poisson distribution is a theoretically and empirically motivated model for the sampling variability of single cell RNA-sequencing counts and an essential building block for analysis approaches including differential expression analysis, principal component analysis and factor analysis. Existing implementations for inferring its parameters from data often struggle with the size of single cell datasets, which can comprise millions of cells; at the same time, they do not take full advantage of the fact that zero and other small numbers are frequent in the data. These limitations have hampered uptake of the model, leaving room for statistically inferior approaches such as logarithm(-like) transformation.
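
To make the model concrete, the following is a minimal Python sketch of the Gamma-Poisson (negative binomial) distribution as a Poisson whose rate is Gamma-distributed, with variance mu + alpha * mu^2. It is an illustration only and not the package's fitting algorithm: the function names, the parameter values, and the simple method-of-moments estimator are assumptions chosen for this example. It also shows why zeros are frequent at low means.

```python
import numpy as np


def sample_gamma_poisson(mean, overdispersion, size, rng):
    """Draw counts from a Poisson whose rate follows a Gamma distribution."""
    if overdispersion == 0:
        # With no overdispersion the model reduces to a plain Poisson.
        return rng.poisson(mean, size=size)
    # Gamma with shape 1/alpha and scale alpha*mean has mean `mean`
    # and variance alpha * mean**2.
    rates = rng.gamma(shape=1.0 / overdispersion,
                      scale=overdispersion * mean,
                      size=size)
    return rng.poisson(rates)


def moment_overdispersion(counts):
    """Method-of-moments estimate of alpha from Var = mu + alpha * mu**2."""
    mu = counts.mean()
    var = counts.var(ddof=1)
    return max((var - mu) / mu**2, 0.0)


# Hypothetical parameter values, chosen to mimic a lowly expressed gene.
true_mean, true_alpha = 0.5, 2.0
rng = np.random.default_rng(0)
counts = sample_gamma_poisson(true_mean, true_alpha, size=200_000, rng=rng)

alpha_hat = moment_overdispersion(counts)
zero_fraction = float((counts == 0).mean())
# Theoretical probability of a zero: (1 + alpha * mu) ** (-1 / alpha)
expected_zero_fraction = (1 + true_alpha * true_mean) ** (-1 / true_alpha)
```

With these values, roughly 70% of the simulated counts are expected to be zero, which is the kind of sparsity the abstract says existing implementations do not exploit.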

RESULTS

We present a new R package for fitting the Gamma-Poisson distribution to data with the characteristics of modern single cell datasets more quickly and more accurately than existing methods. The software can work with data on disk without having to load them into RAM simultaneously.
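
The general idea of working with on-disk data without loading it all into RAM can be sketched in Python as below. This is a generic illustration using a NumPy memory map and is not how the package itself accesses data: the file layout, the helper name `chunked_gene_means`, and the chunk size are assumptions made for this example.

```python
import os
import tempfile

import numpy as np


def chunked_gene_means(path, n_genes, n_cells, chunk_size=1000):
    """Per-gene mean counts, reading only `chunk_size` cells at a time."""
    # The memory map opens the file without reading its contents up front.
    data = np.memmap(path, dtype=np.int32, mode="r", shape=(n_genes, n_cells))
    totals = np.zeros(n_genes, dtype=np.float64)
    for start in range(0, n_cells, chunk_size):
        # Only this block of cells (columns) is summed in each iteration.
        chunk = np.asarray(data[:, start:start + chunk_size])
        totals += chunk.sum(axis=1)
    del data  # release the file handle before the caller removes the file
    return totals / n_cells


# Write a small synthetic genes-by-cells count matrix to a temporary file.
rng = np.random.default_rng(1)
n_genes, n_cells = 50, 5000
counts = rng.poisson(0.3, size=(n_genes, n_cells)).astype(np.int32)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "counts.bin")
    counts.tofile(path)
    means = chunked_gene_means(path, n_genes, n_cells)
```

The per-gene totals are accumulated across chunks, so peak memory depends on the chunk size rather than on the number of cells.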

AVAILABILITY AND IMPLEMENTATION

The package glmGamPoi is available from Bioconductor for Windows, macOS and Linux, and source code is available on github.com/const-ae/glmGamPoi under a GPL-3 license. The scripts to reproduce the results of this paper are available on github.com/const-ae/glmGamPoi-Paper.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f3b/8023675/12cd2dedc226/btaa1009f1.jpg
