Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, CA 90095, USA.
Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA.
Nucleic Acids Res. 2021 May 7;49(8):e48. doi: 10.1093/nar/gkab031.
Targeted mRNA expression panels, measuring up to 800 genes, are used in academic and clinical settings because of their low cost and high sensitivity for archived samples. Most samples assayed on targeted panels originate from bulk tissue composed of many cell types, and cell-type heterogeneity confounds biological signals. Reference-free methods are used when cell-type-specific expression references are unavailable, but limited feature spaces render implementation challenging in targeted panels. Here, we present DeCompress, a semi-reference-free deconvolution method for targeted panels. DeCompress leverages a reference RNA-seq or microarray dataset from similar tissue to expand the feature space of targeted panels using compressed sensing. Ensemble reference-free deconvolution is performed on this artificially expanded dataset to estimate cell-type proportions and gene signatures. In simulated mixtures, four public cell line mixtures, and a targeted panel (1199 samples; 406 genes) from the Carolina Breast Cancer Study, DeCompress recapitulates cell-type proportions with less error than reference-free methods and finds biologically relevant compartments. We integrate compartment estimates into cis-eQTL mapping in breast cancer, identifying a tumor-specific cis-eQTL for CCR3 (C-C Motif Chemokine Receptor 3) at a risk locus. DeCompress improves upon reference-free methods without requiring expression profiles from pure cell populations, with applications in genomic analyses and clinical settings.
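The two-stage idea in the abstract (expand the panel's feature space from a reference dataset, then run reference-free deconvolution on the expanded matrix) can be illustrated with a minimal sketch. This is not the DeCompress implementation: the sparse recovery step is stood in for by L1-penalised regression solved with ISTA, the ensemble deconvolution is stood in for by a single multiplicative-update NMF, and all function names, parameters, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fit_sparse_map(X_panel, Y_rest, lam=0.05, n_iter=500):
    """Learn a sparse W with Y_rest ~ X_panel @ W via ISTA.
    X_panel: (n_ref, n_panel_genes); Y_rest: (n_ref, n_off_panel_genes).
    Hypothetical stand-in for the compressed-sensing step."""
    n = X_panel.shape[0]
    # Lipschitz constant of the gradient of (1/2n) * ||Y - XW||_F^2
    lipschitz = np.linalg.norm(X_panel, 2) ** 2 / n
    W = np.zeros((X_panel.shape[1], Y_rest.shape[1]))
    for _ in range(n_iter):
        grad = X_panel.T @ (X_panel @ W - Y_rest) / n
        W = soft_threshold(W - grad / lipschitz, lam / lipschitz)
    return W

def nmf_deconvolve(V, k, n_iter=300, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V (samples x genes) ~ P (samples x k) @ S (k x genes).
    Hypothetical stand-in for ensemble reference-free deconvolution."""
    rng = np.random.default_rng(seed)
    P = rng.random((V.shape[0], k)) + eps
    S = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        S *= (P.T @ V) / (P.T @ P @ S + eps)
        P *= (V @ S.T) / (P @ S @ S.T + eps)
    # Crude normalisation: rescale each sample's loadings to sum to 1
    props = P / P.sum(axis=1, keepdims=True)
    return props, S

# --- Simulated example (all sizes and distributions are assumptions) ---
rng = np.random.default_rng(42)
k, n_genes, n_panel = 3, 50, 15          # cell types, total genes, panel genes
n_ref, n_target = 60, 40                 # reference and targeted-panel samples

signatures = rng.gamma(2.0, 1.0, size=(k, n_genes))   # cell-type gene signatures

def simulate(n_samples):
    """Bulk expression as a noisy linear mixture of the signatures."""
    p = rng.dirichlet(np.ones(k), size=n_samples)
    expr = p @ signatures + rng.normal(0.0, 0.05, size=(n_samples, n_genes))
    return np.clip(expr, 0.0, None)

reference = simulate(n_ref)                       # full feature space
target_panel = simulate(n_target)[:, :n_panel]    # only panel genes observed

# Stage 1: learn panel -> off-panel map on the reference, then expand the target
W = fit_sparse_map(reference[:, :n_panel], reference[:, n_panel:])
predicted_rest = np.clip(target_panel @ W, 0.0, None)
expanded = np.hstack([target_panel, predicted_rest])

# Stage 2: reference-free deconvolution on the expanded matrix
props, est_signatures = nmf_deconvolve(expanded, k)
print(expanded.shape, props.shape)
```

The sparsity penalty `lam` controls how many panel genes contribute to each predicted off-panel gene. Estimated compartments from NMF are unordered and unlabeled, so matching them to known cell types would need an additional annotation step that this sketch does not attempt.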