Department of Oncology UNIL CHUV, Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland.
Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
Methods Mol Biol. 2020;2120:233-248. doi: 10.1007/978-1-0716-0327-7_17.
Gene expression profiling is now routinely performed on clinically relevant samples (e.g., tumor specimens). Such measurements are often obtained from bulk samples containing a mixture of cell types. Knowing the proportions of these cell types is crucial, as they are key determinants of disease evolution and response to treatment. Moreover, heterogeneity in cell type proportions across samples is an important confounding factor in downstream analyses. Many tools have been developed to estimate the proportions of the different cell types from bulk gene expression data. Here, we provide guidelines and examples on how to use these tools, with a special focus on our recent computational method EPIC (Estimating the Proportions of Immune and Cancer cells). EPIC includes RNA-seq-based gene expression reference profiles from immune cells and other nonmalignant cell types found in tumors, and it can also handle user-defined gene expression reference profiles. Unique features of EPIC include the ability to account for an uncharacterized cell type, a renormalization step that accounts for the different mRNA content of each cell type, and the use of single-cell RNA-seq data to derive biologically relevant reference gene expression profiles. EPIC is available as a web application (http://epic.gfellerlab.org) and as an R package (https://github.com/GfellerLab/EPIC).
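The renormalization step mentioned in the abstract can be illustrated with a minimal sketch. This is not EPIC's code (EPIC is an R package) and it is written under stated assumptions: a deconvolution step has already produced, for each reference cell type, the proportion of mRNA it contributes to the bulk sample; whatever is left over is attributed to a single uncharacterized cell type; and an mRNA-content-per-cell value is available for every cell type. The function name `mrna_to_cell_fractions`, the label `otherCells`, and all numeric values are hypothetical and chosen only for illustration.

```python
def mrna_to_cell_fractions(mrna_props, mrna_per_cell,
                           other_label="otherCells",
                           other_mrna_per_cell=1.0):
    """Convert mRNA proportions into cell fractions (illustrative sketch).

    mrna_props:     dict of cell type -> share of the bulk mRNA (sum <= 1)
    mrna_per_cell:  dict of cell type -> relative mRNA content per cell
    The share not explained by the known cell types is assigned to one
    uncharacterized cell type (assumption: its mRNA content is
    `other_mrna_per_cell`).
    """
    known_total = sum(mrna_props.values())
    if known_total > 1.0 + 1e-9:
        raise ValueError("mRNA proportions of known cell types exceed 1")

    # Remainder of the mRNA is attributed to the uncharacterized cell type.
    props = dict(mrna_props)
    props[other_label] = max(0.0, 1.0 - known_total)

    content = dict(mrna_per_cell)
    content[other_label] = other_mrna_per_cell

    # Cells with more mRNA per cell are over-represented in the mRNA pool,
    # so divide by mRNA content, then rescale so the fractions sum to 1.
    scaled = {ct: props[ct] / content[ct] for ct in props}
    total = sum(scaled.values())
    return {ct: value / total for ct, value in scaled.items()}


# Hypothetical inputs, for illustration only.
mrna_props = {"Tcells": 0.2, "Macrophages": 0.3}
mrna_per_cell = {"Tcells": 0.5, "Macrophages": 1.5}

fractions = mrna_to_cell_fractions(mrna_props, mrna_per_cell)
for cell_type, fraction in fractions.items():
    print(f"{cell_type}: {fraction:.3f}")
```

In this toy input, macrophages contribute more mRNA than T cells, yet after renormalization the T cells make up the larger share of cells because each T cell is assumed to carry less mRNA. This is the kind of correction the renormalization step is meant to provide.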