Differential expression of single-cell RNA-seq data using Tweedie models.

Author affiliations

Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA.

Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA.

Publication information

Stat Med. 2022 Aug 15;41(18):3492-3510. doi: 10.1002/sim.9430. Epub 2022 Jun 2.

DOI: 10.1002/sim.9430
PMID: 35656596
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9288986/
Abstract

The performance of computational methods and software to identify differentially expressed features in single-cell RNA-sequencing (scRNA-seq) has been shown to be influenced by several factors, including the choice of the normalization method used and the choice of the experimental platform (or library preparation protocol) to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method out of over 100 DE tools available to date, each relying on their own assumptions to model scRNA-seq expression features. To model the technological variability in cross-platform scRNA-seq data, here we propose to use Tweedie generalized linear models that can flexibly capture a large dynamic range of observed scRNA-seq expression profiles across experimental platforms induced by platform- and gene-specific statistical properties such as heavy tails, sparsity, and gene expression distributions. We also propose a zero-inflated Tweedie model that allows zero probability mass to exceed a traditional Tweedie distribution to model zero-inflated scRNA-seq data with excessive zero counts. Using both synthetic and published plate- and droplet-based scRNA-seq datasets, we perform a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms the state-of-the-art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open-source software (R/Bioconductor package) is available at https://github.com/himelmallick/Tweedieverse.
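
The zero-inflation idea in the abstract can be illustrated with a small sketch. It is not the Tweedieverse implementation. It assumes the standard compound Poisson-gamma representation of a Tweedie distribution with power parameter 1 < p < 2, mean mu, and dispersion phi, under which the variance is phi * mu^p and the probability of an exact zero is exp(-lambda) with lambda = mu^(2-p) / (phi * (2-p)). The mixing weight `pi0` and all function names are hypothetical, chosen only for this illustration.

```python
import math
import random


def tweedie_params(mu, phi, p):
    """Map (mean, dispersion, power) to compound Poisson-gamma parameters.

    Assumption: 1 < p < 2, the range where a Tweedie variable is a
    Poisson-distributed number of gamma-distributed jumps.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # Poisson rate
    shape = (2.0 - p) / (p - 1.0)               # gamma shape
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale
    return lam, shape, scale


def tweedie_zero_prob(mu, phi, p):
    """P(Y = 0) for a plain Tweedie: the Poisson count is zero."""
    lam, _, _ = tweedie_params(mu, phi, p)
    return math.exp(-lam)


def zi_tweedie_zero_prob(pi0, mu, phi, p):
    """P(Y = 0) when an extra point mass pi0 at zero is mixed in (hypothetical name)."""
    return pi0 + (1.0 - pi0) * tweedie_zero_prob(mu, phi, p)


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_zi_tweedie(pi0, mu, phi, p, rng):
    """Draw one value: a structural zero with probability pi0, else a Tweedie draw."""
    if rng.random() < pi0:
        return 0.0
    lam, shape, scale = tweedie_params(mu, phi, p)
    n_jumps = sample_poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_jumps))


if __name__ == "__main__":
    rng = random.Random(0)
    pi0, mu, phi, p = 0.3, 2.0, 1.0, 1.5
    draws = [sample_zi_tweedie(pi0, mu, phi, p, rng) for _ in range(20000)]
    observed = sum(d == 0.0 for d in draws) / len(draws)
    print("plain Tweedie P(0):        ", tweedie_zero_prob(mu, phi, p))
    print("zero-inflated Tweedie P(0):", zi_tweedie_zero_prob(pi0, mu, phi, p))
    print("observed zero fraction:    ", observed)
```

With any `pi0` > 0 the mixture places more mass at zero than the plain Tweedie with the same mu, phi, and p, which is the sense in which the zero probability is allowed to exceed that of a traditional Tweedie distribution.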

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/4b3db1316277/nihms-1801933-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/09a6c9485f33/nihms-1801933-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/2ecffcee4a9b/nihms-1801933-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/716b7b9d56aa/nihms-1801933-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/4d3315f7aa7c/nihms-1801933-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e070/9288986/133d1eefcb72/nihms-1801933-f0006.jpg
