Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data.

Author affiliations

Institute for Computational Biomedicine, Bioquant, Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Heidelberg, Germany.

Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, Aachen, Germany.

Publication information

Genome Biol. 2020 Feb 12;21(1):36. doi: 10.1186/s13059-020-1949-z.

DOI:10.1186/s13059-020-1949-z

PMID:32051003

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7017576/

Abstract

BACKGROUND

Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to do such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear if functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.

RESULTS

To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk RNA-seq tools PROGENy and GO enrichment, which estimate pathway activities, and DoRothEA, which estimates transcription factor (TF) activities, and compare them against the tools SCENIC/AUCell and metaVIPER, which were designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal performance comparable to that on the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.

CONCLUSIONS

Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.
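
The benchmark idea described in the abstract (simulating sparse single cells from a bulk perturbation profile, then scoring them with a footprint gene set) can be illustrated with a minimal sketch. This is not the authors' simulation procedure and not the scoring statistic of PROGENy or DoRothEA: the downsampling scheme, the weighted-mean score, the function names `simulate_cells` and `footprint_score`, the gene indices, and all numeric settings below are assumptions chosen purely for illustration.

```python
import math
import random
from collections import Counter


def simulate_cells(bulk_counts, n_cells, library_size, rng):
    """Sample n_cells low-depth count vectors from one bulk profile.

    Each cell receives `library_size` reads drawn in proportion to the bulk
    counts, so lowly expressed genes often get zero reads (drop-outs).
    """
    genes = range(len(bulk_counts))
    cells = []
    for _ in range(n_cells):
        reads = Counter(rng.choices(genes, weights=bulk_counts, k=library_size))
        cells.append([reads.get(g, 0) for g in genes])
    return cells


def footprint_score(cell, weights):
    """Weighted mean of log-normalized expression over the footprint genes."""
    total = sum(cell) or 1
    log_norm = [math.log1p(c / total * 1e4) for c in cell]
    norm = sum(abs(w) for w in weights.values())
    return sum(w * log_norm[g] for g, w in weights.items()) / norm


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
n_genes = 200

# Hypothetical bulk profiles: the perturbation triples the footprint genes.
control_bulk = [rng.randint(50, 500) for _ in range(n_genes)]
footprint = {g: 1.0 for g in range(10)}  # assumed: genes 0-9 respond positively
perturbed_bulk = [c * 3 if g in footprint else c
                  for g, c in enumerate(control_bulk)]

# Simulate 30 cells per condition with a small library size (500 reads).
control_cells = simulate_cells(control_bulk, 30, 500, rng)
perturbed_cells = simulate_cells(perturbed_bulk, 30, 500, rng)

control_mean = mean([footprint_score(c, footprint) for c in control_cells])
perturbed_mean = mean([footprint_score(c, footprint) for c in perturbed_cells])
dropout_rate = mean([c.count(0) / n_genes for c in control_cells])

print(f"drop-out rate in control cells: {dropout_rate:.2f}")
print(f"mean score, control:   {control_mean:.2f}")
print(f"mean score, perturbed: {perturbed_mean:.2f}")
```

In this toy setting the footprint score still separates perturbed from control cells even though many genes have zero reads per cell, which is the kind of robustness the benchmark asks about. Swapping in a different `footprint` dictionary while keeping the statistic fixed is one simple way to explore the gene-set sensitivity mentioned in the conclusions.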

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f79f/7017576/bdcc56f8746f/13059_2020_1949_Fig1_HTML.jpg
