cellsig plug-in enhances CIBERSORTx signature selection for multidataset transcriptomes with sparse multilevel modelling.

Affiliations

Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Parkville, VIC 3010, Australia.

Cancer Biology And Therapy, Olivia Newton-John Cancer Research Institute, Heidelberg, VIC 3038, Australia.

Publication information

Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad685.

DOI: 10.1093/bioinformatics/btad685
PMID: 37952182
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10692870/
Abstract

MOTIVATION

The precise characterization of cell-type transcriptomes is pivotal to understanding cellular lineages, deconvolution of bulk transcriptomes, and clinical applications. Single-cell RNA sequencing resources like the Human Cell Atlas have revolutionised cell-type profiling. However, challenges persist due to data heterogeneity and discrepancies across different studies. One limitation of prevailing tools such as CIBERSORTx is their inability to address hierarchical data structures and handle nonoverlapping gene sets across samples, relying on filtering or imputation.

RESULTS

Here, we present cellsig, a Bayesian sparse multilevel model designed to improve signature estimation by adjusting data for multilevel effects and modelling for gene-set sparsity. Our model is tailored to large-scale, heterogeneous pseudobulk and bulk RNA sequencing data collections with nonoverlapping gene sets. We tested the performances of cellsig on a novel curated Human Bulk Cell-type Catalogue, which harmonizes 1435 samples across 58 datasets. We show that cellsig significantly enhances cell-type marker gene ranking performance. This approach is valuable for cell-type signature selection, with implications for marker gene validation, single-cell annotation, and deconvolution benchmarks.
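To make the idea of adjusting for multilevel effects concrete, the following is a minimal sketch and not cellsig's Bayesian model. It assumes a single gene in a single cell type, log-scale expression values, and fixed, hypothetical variance components (`within_var`, `between_var`); the dataset names and numbers are invented for illustration. A dataset that did not measure the gene is simply absent from the input, so neither filtering nor imputation is required.

```python
from statistics import mean


def pooled_mean(groups):
    """Naive estimate: pool all samples, ignoring dataset membership."""
    values = [v for samples in groups.values() for v in samples]
    return mean(values)


def multilevel_mean(groups, within_var=1.0, between_var=1.0):
    """Random-intercept estimate of the mean across datasets.

    Each dataset mean has variance between_var + within_var / n, so it is
    weighted by the inverse of that quantity. Both variance components are
    assumed known here; a full model would estimate them from the data.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for samples in groups.values():
        n = len(samples)
        weight = 1.0 / (between_var + within_var / n)
        weighted_sum += weight * mean(samples)
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical log-expression values for one gene in one cell type.
# A fourth dataset that lacks this gene would just be missing from the dict.
groups = {
    "study_A": [8.0, 8.2, 7.8, 8.1, 7.9, 8.0, 8.0, 8.0],  # large study
    "study_B": [5.0, 5.2],
    "study_C": [4.8, 5.0],
}

print(round(pooled_mean(groups), 3))      # dominated by study_A
print(round(multilevel_mean(groups), 3))  # closer to the mean of study means
```

In this toy input the pooled estimate is pulled toward the largest study, whereas the multilevel estimate treats each study as one draw from a population of studies and so sits nearer the average of the per-study means.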

AVAILABILITY AND IMPLEMENTATION

Codes and the interactive app are available at https://github.com/stemangiola/cellsig; and the database is available at https://doi.org/10.5281/zenodo.7582421.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac3/10692870/b2a0c9835f05/btad685f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac3/10692870/f8056365bb49/btad685f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac3/10692870/15136a36d299/btad685f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac3/10692870/61fc85431c22/btad685f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac3/10692870/a3f85caf9034/btad685f5.jpg
