Sicherman Jordan, Newton Dwight F, Pavlidis Paul, Sibille Etienne, Tripathy Shreejoy J
Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada.
Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada.
Front Mol Neurosci. 2021 Mar 3;14:637143. doi: 10.3389/fnmol.2021.637143. eCollection 2021.
Transcriptionally profiling minor cellular populations remains an ongoing challenge in molecular genomics. Single-cell RNA sequencing has provided valuable insights into a number of hypotheses, but practical and analytical challenges have limited its widespread adoption. A similar approach, which we term single-cell type RNA sequencing (sctRNA-seq), involves enriching and sequencing a pool of cells, yielding transcriptomes at cell type-level resolution. While this approach improves mRNA sampling from targeted cell types, it is potentially affected by off-target contamination from surrounding cell types. Here, we leveraged single-cell sequencing datasets to apply a computational approach for estimating and controlling the amount of off-target cell type contamination in sctRNA-seq datasets. Across datasets obtained using a number of cell purification technologies, we found that most sctRNA-seq datasets tended to show some off-target mRNA contamination from surrounding cells. However, including covariates for cellular contamination in downstream differential expression analyses increased the quality of our models in case/control comparisons and typically resulted in the discovery of more differentially expressed genes. In general, our method provides a flexible approach for detecting and controlling off-target cell type contamination in sctRNA-seq datasets.
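The idea of using a contamination estimate as a covariate can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how contamination is scored, so the sketch assumes a simple score (the mean log-expression of off-target marker genes, which in practice might be chosen from a single-cell reference) and an ordinary least-squares model per gene. The function names `contamination_score` and `fit_gene`, the toy matrix, and the marker indices are all hypothetical.

```python
import numpy as np

def contamination_score(log_expr, marker_idx):
    """Assumed score: mean log-expression of off-target marker genes per sample.

    log_expr has shape (genes, samples); marker_idx lists the marker gene rows.
    """
    return log_expr[marker_idx, :].mean(axis=0)

def fit_gene(y, group, covariate=None):
    """Ordinary least-squares fit of one gene.

    Returns the case/control coefficient and the residual sum of squares.
    """
    cols = [np.ones_like(y), group]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(beta[1]), float(resid @ resid)

# Toy data (invented for illustration): 6 samples, 3 controls and 3 cases.
group = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
true_contamination = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])

# Rows 0 and 1 are off-target marker genes; row 2 is a gene of interest whose
# measured level depends on both case/control status and contamination.
log_expr = np.vstack([
    true_contamination,
    true_contamination,
    1.0 + 2.0 * group + 0.5 * true_contamination,
])

score = contamination_score(log_expr, marker_idx=[0, 1])
effect_naive, rss_naive = fit_gene(log_expr[2], group)
effect_adj, rss_adj = fit_gene(log_expr[2], group, covariate=score)

print(f"naive:    effect={effect_naive:.2f}, RSS={rss_naive:.2f}")
print(f"adjusted: effect={effect_adj:.2f}, RSS={rss_adj:.2f}")
```

In this toy case the contamination score is balanced across groups, so the estimated case/control effect is the same in both models, but the adjusted model leaves far less unexplained variance. Lower residual variance is one way a contamination covariate could increase power to detect differentially expressed genes, under the assumptions stated above.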