Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge.

Affiliations

Department of Epidemiology and Public Health, Division of Biostatistics and Bioinformatics, University of Maryland School of Medicine, Baltimore, MD 21201, USA.

Department of Neurosurgery, University of Maryland School of Medicine, Baltimore, MD 21201, USA.

Publication information

Bioinformatics. 2022 Sep 30;38(19):4530-4536. doi: 10.1093/bioinformatics/btac563.

DOI:10.1093/bioinformatics/btac563

PMID:35980155

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9525013/

Abstract

MOTIVATION

Cell-type deconvolution of bulk tissue RNA sequencing (RNA-seq) data is an important step toward understanding the variations in cell-type composition among disease conditions. Owing to recent advances in single-cell RNA sequencing (scRNA-seq) and the availability of large amounts of bulk RNA-seq data in disease-relevant tissues, various deconvolution methods have been developed. However, the performance of existing methods heavily relies on the quality of information provided by external data sources, such as the selection of scRNA-seq data as a reference and prior biological information.

RESULTS

We present the Integrated and Robust Deconvolution (InteRD) algorithm to infer cell-type proportions from target bulk RNA-seq data. Owing to the innovative use of penalized regression with a new evaluation criterion for deconvolution, InteRD has three primary advantages. First, it is able to effectively integrate deconvolution results from multiple scRNA-seq datasets. Second, InteRD calibrates estimates from reference-based deconvolution by taking into account extra biological information as priors. Third, the proposed algorithm is robust to inaccurate external information imposed in the deconvolution system. Extensive numerical evaluations and real-data applications demonstrate that InteRD yields more accurate and robust cell-type proportion estimates that agree well with known biology.
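The general idea of combining reference-specific estimates under a penalty can be illustrated with a toy sketch. This is not the InteRD algorithm or its evaluation criterion, neither of which the abstract specifies. It assumes a squared-error fit to the bulk profile, a quadratic penalty toward a prior proportion vector, and a grid search over ensemble weights. All function names, the signature matrix, and the numbers are hypothetical.

```python
from itertools import product


def reconstruct(signature, props):
    """Predicted bulk expression: for each gene, sum over cell types of signature * proportion."""
    return [sum(s * p for s, p in zip(row, props)) for row in signature]


def combine(estimates, weights):
    """Convex combination of the proportion vectors estimated from each reference."""
    n_types = len(estimates[0])
    return [sum(w * est[k] for w, est in zip(weights, estimates)) for k in range(n_types)]


def penalized_loss(bulk, signature, props, prior, lam):
    """Squared reconstruction error plus a quadratic penalty toward the prior (assumed form)."""
    fit = sum((b - r) ** 2 for b, r in zip(bulk, reconstruct(signature, props)))
    penalty = sum((p - q) ** 2 for p, q in zip(props, prior))
    return fit + lam * penalty


def best_weights(bulk, signature, estimates, prior, lam=0.1, steps=20):
    """Grid search over the weight simplex; returns (loss, weights, combined proportions)."""
    best = None
    for combo in product(range(steps + 1), repeat=len(estimates)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [c / steps for c in combo]
        props = combine(estimates, weights)
        loss = penalized_loss(bulk, signature, props, prior, lam)
        if best is None or loss < best[0]:
            best = (loss, weights, props)
    return best


# Hypothetical toy data: 3 genes, 2 cell types.
signature = [[10.0, 1.0], [2.0, 8.0], [5.0, 5.0]]
true_props = [0.7, 0.3]
bulk = reconstruct(signature, true_props)

# Estimates from two hypothetical references: one accurate, one inaccurate.
estimates = [[0.7, 0.3], [0.4, 0.6]]
prior = [0.7, 0.3]

loss, weights, props = best_weights(bulk, signature, estimates, prior)
print(weights, props)
```

In this toy setting the search places all weight on the accurate reference, because any weight on the inaccurate one increases both the reconstruction error and the penalty.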

AVAILABILITY AND IMPLEMENTATION

The proposed InteRD framework is implemented in R and the package is available at https://cran.r-project.org/web/packages/InteRD/index.html.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
