Gene set optimization for single cell transcriptomics.

Author information

Frost H Robert

Affiliation

Dartmouth College, Hanover NH 03755, USA.

Publication information

Comput Intell Methods Bioinform Biostat. 2025;15276:183-195. doi: 10.1007/978-3-031-89704-7_14. Epub 2025 May 15.

Abstract

Although single cell RNA-sequencing (scRNA-seq) provides unprecedented insights into the biology of complex tissues, analyzing such data on a gene-by-gene basis is challenging due to the large number of tested hypotheses and consequent low statistical power and difficult interpretation. These issues are magnified by the increased noise, significant sparsity and multi-modal distributions characteristic of single cell data. One promising approach for addressing these challenges is gene set testing, or pathway analysis. Unfortunately, statistical and biological differences between single cell and bulk transcriptomic data make it challenging to use existing gene set collections, which were developed for bulk tissue analysis, on scRNA-seq data. In this paper, we describe a procedure for customizing gene set collections originally created for bulk tissue analysis to reflect the structure of gene activity within specific cell types. Our approach leverages information about mean gene expression in the 81 human cell types profiled via scRNA-seq by the Human Protein Atlas (HPA) Single Cell Type Atlas. This HPA information is used to compute cell type-specific gene and gene set weights that can be used to filter or weight gene set collections. As demonstrated through the analysis of immune cell scRNA-seq data using gene sets from the Molecular Signatures Database (MSigDB), accounting for cell type-specificity can significantly improve gene set testing power and interpretability.
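The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: a gene's weight is taken to be its mean expression in the target cell type divided by its mean across all profiled cell types, a gene set's weight is the average of its members' weights, and filtering keeps genes at or above a threshold. The function names, the `min_weight` parameter, and the expression values are hypothetical toy inputs, not HPA data.

```python
from statistics import mean

def gene_weights(mean_expr, cell_type):
    """Assumed weight: expression in `cell_type` relative to the
    gene's average over all cell types (not the paper's formula)."""
    weights = {}
    for gene, by_type in mean_expr.items():
        overall = mean(by_type.values())
        weights[gene] = by_type[cell_type] / overall if overall > 0 else 0.0
    return weights

def gene_set_weight(gene_set, weights):
    """Assumed set-level weight: average weight of the member genes."""
    vals = [weights[g] for g in gene_set if g in weights]
    return mean(vals) if vals else 0.0

def filter_gene_set(gene_set, weights, min_weight=1.0):
    """Keep genes whose weight meets the (hypothetical) threshold."""
    return [g for g in gene_set if weights.get(g, 0.0) >= min_weight]

# Toy mean-expression table: made-up numbers for three cell types.
toy_expr = {
    "CD3E": {"T-cells": 90.0, "B-cells": 5.0, "Hepatocytes": 1.0},
    "CD19": {"T-cells": 2.0, "B-cells": 80.0, "Hepatocytes": 2.0},
    "ALB": {"T-cells": 0.0, "B-cells": 0.0, "Hepatocytes": 300.0},
    "ACTB": {"T-cells": 100.0, "B-cells": 100.0, "Hepatocytes": 100.0},
}
toy_set = ["CD3E", "CD19", "ALB", "ACTB"]

t_weights = gene_weights(toy_expr, "T-cells")
t_filtered = filter_gene_set(toy_set, t_weights)
t_set_weight = gene_set_weight(toy_set, t_weights)
print(t_filtered, round(t_set_weight, 3))
```

Under these assumptions, a gene set built for bulk tissue would be trimmed to the genes that are actually active in the cell type of interest, or its members could be weighted by the per-gene values instead of being dropped.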
