CATD: a reproducible pipeline for selecting cell-type deconvolution methods across tissues.

Author information

Vathrakokoili Pournara Anna, Miao Zhichao, Beker Ozgur Yilimaz, Nolte Nadja, Brazma Alvis, Papatheodorou Irene

Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.

Open Targets, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom.

Publication information

Bioinform Adv. 2024 Mar 23;4(1):vbae048. doi: 10.1093/bioadv/vbae048. eCollection 2024.

DOI:10.1093/bioadv/vbae048

PMID:38638280

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11023940/

Abstract

MOTIVATION

Cell-type deconvolution methods aim to infer cell composition from bulk transcriptomic data. The proliferation of methods, coupled with the inconsistent results they often produce, highlights the pressing need for guidance in selecting appropriate methods. Additionally, the growing accessibility of single-cell RNA sequencing datasets, often accompanied by bulk expression data from related samples, enables the benchmarking of existing methods.

RESULTS

In this study, we conduct a comprehensive assessment of 31 methods, utilizing single-cell RNA-sequencing data from diverse human and mouse tissues. Employing various simulation scenarios, we reveal the efficacy of regression-based deconvolution methods, highlighting their sensitivity to reference choices. We investigate the impact of bulk-reference differences, incorporating variables such as sample, study and technology. We provide validation using a gold standard dataset from mononuclear cells and suggest a consensus prediction of proportions when ground truth is not available. We validated the consensus method on data from the stomach and studied its spillover effect. Importantly, we propose the use of the critical assessment of transcriptomic deconvolution (CATD) pipeline, which encompasses functionalities for generating references and pseudo-bulks and for running implemented deconvolution methods. CATD streamlines simultaneous deconvolution of numerous bulk samples, providing a practical solution for speeding up the evaluation of newly developed methods.
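
As a rough illustration of two ideas named in the results (pseudo-bulk generation and regression-based deconvolution), the sketch below builds a pseudo-bulk sample from labelled single cells, derives a mean-expression reference per cell type, and recovers proportions by non-negative least squares. It is not the CATD implementation: the function names (`make_pseudobulk`, `build_reference`, `nnls_deconvolve`), the noise-free toy signatures and the projected-gradient solver are all assumptions chosen to keep the example self-contained.

```python
import random

def make_pseudobulk(cells, labels, n_cells, rng):
    """Average n_cells randomly drawn cells; return (bulk profile, true proportions)."""
    n_genes = len(cells[0])
    bulk = [0.0] * n_genes
    counts = {t: 0 for t in sorted(set(labels))}
    for _ in range(n_cells):
        i = rng.randrange(len(cells))          # sample with replacement
        counts[labels[i]] += 1
        for g in range(n_genes):
            bulk[g] += cells[i][g]
    bulk = [x / n_cells for x in bulk]
    props = {t: c / n_cells for t, c in counts.items()}
    return bulk, props

def build_reference(cells, labels):
    """Mean expression per cell type (a simple reference matrix)."""
    n_genes = len(cells[0])
    ref = {}
    for t in sorted(set(labels)):
        members = [c for c, l in zip(cells, labels) if l == t]
        ref[t] = [sum(c[g] for c in members) / len(members) for g in range(n_genes)]
    return ref

def nnls_deconvolve(bulk, ref, n_iter=2000):
    """Non-negative least squares via projected gradient descent (illustrative solver)."""
    types = sorted(ref)
    n_genes = len(bulk)
    # 1 / squared Frobenius norm is a safe step size (it bounds the largest eigenvalue).
    step = 1.0 / sum(v * v for t in types for v in ref[t])
    p = [1.0 / len(types)] * len(types)
    for _ in range(n_iter):
        resid = [sum(p[k] * ref[t][g] for k, t in enumerate(types)) - bulk[g]
                 for g in range(n_genes)]
        grad = [sum(ref[t][g] * resid[g] for g in range(n_genes)) for t in types]
        p = [max(0.0, pk - step * gk) for pk, gk in zip(p, grad)]  # project onto p >= 0
    total = sum(p) or 1.0
    return {t: pk / total for t, pk in zip(types, p)}              # rescale to proportions

# Hypothetical noise-free toy data: three cell types, four genes.
signatures = {"A": [10.0, 1.0, 0.0, 2.0],
              "B": [1.0, 8.0, 1.0, 2.0],
              "C": [0.0, 1.0, 9.0, 2.0]}
cells, labels = [], []
for cell_type, n in (("A", 30), ("B", 50), ("C", 20)):
    cells += [list(signatures[cell_type]) for _ in range(n)]
    labels += [cell_type] * n

rng = random.Random(0)
bulk, truth = make_pseudobulk(cells, labels, 200, rng)
reference = build_reference(cells, labels)
estimate = nnls_deconvolve(bulk, reference)
max_abs_error = max(abs(estimate[t] - truth[t]) for t in truth)
```

Because the toy cells carry no noise and the reference is built from the same cells as the pseudo-bulk, the estimate matches the true proportions almost exactly. Real benchmarks are harder precisely because the reference and the bulk differ by sample, study or technology, which is the sensitivity to reference choice that the abstract highlights.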

AVAILABILITY AND IMPLEMENTATION

https://github.com/Papatheodorou-Group/CATD_snakemake.
