The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
Nat Commun. 2019 Nov 15;10(1):5188. doi: 10.1038/s41467-019-13099-0.
Allele-specific expression (ASE) at single-cell resolution is a critical tool for understanding the stochastic and dynamic features of gene expression. However, low read coverage and high biological variability present challenges for analyzing ASE. We demonstrate that discarding multi-mapping reads increases the variability of allelic proportion estimates, increases the frequency of sampling zeros, and can lead to spurious findings of dynamic and monoallelic gene expression. Here, we report a method for ASE analysis from single-cell RNA-Seq data that accurately classifies allelic expression states and improves estimation of allelic proportions by pooling information across cells. We further demonstrate that combining information across cells using a hierarchical mixture model reduces sampling variability without sacrificing cell-to-cell heterogeneity. We applied our approach to re-evaluate the statistical independence of allelic bursting and to track changes in the allele-specific expression patterns of cells sampled over a developmental time course.
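To illustrate how pooling information across cells can stabilize allelic proportion estimates for low-coverage cells, here is a minimal sketch for a single gene. It is not the hierarchical mixture model reported in the paper: it substitutes a much simpler empirical-Bayes Beta-binomial shrinkage, ignores multi-mapping reads, and does not classify allelic expression states. The function name `shrink_allelic_proportions`, its parameters, and the read counts are hypothetical.

```python
from statistics import mean, pvariance


def shrink_allelic_proportions(maternal, paternal, fallback_strength=2.0):
    """Shrink per-cell maternal allelic proportions toward a pooled prior.

    Assumption (not from the paper): a single Beta(a, b) prior per gene,
    fitted by the method of moments to the raw per-cell proportions.
    This is a crude fit, because the raw proportions also carry binomial
    sampling noise, so the prior is somewhat too weak.
    """
    totals = [m + p for m, p in zip(maternal, paternal)]
    # Raw proportions are only defined for cells with at least one read.
    raw = [m / t for m, t in zip(maternal, totals) if t > 0]
    if len(raw) < 2:
        raise ValueError("need at least two cells with allele-specific reads")

    mu = mean(raw)
    var = pvariance(raw)
    max_var = mu * (1.0 - mu)  # upper bound on the variance of a Beta with mean mu

    # Method of moments: a + b = mu * (1 - mu) / var - 1.
    if 0.0 < var < max_var:
        strength = max_var / var - 1.0
    else:
        strength = fallback_strength  # degenerate fit: fall back to a weak prior

    a = mu * strength
    b = (1.0 - mu) * strength

    # Posterior mean of the proportion under a Beta prior and binomial counts.
    return [(m + a) / (t + a + b) for m, t in zip(maternal, totals)]


# Hypothetical counts for one gene in four cells (maternal, paternal reads).
maternal_reads = [3, 0, 10, 1]
paternal_reads = [2, 0, 10, 0]
estimates = shrink_allelic_proportions(maternal_reads, paternal_reads)
```

In this sketch a cell with no allele-specific reads (a sampling zero) receives the pooled prior mean rather than an undefined proportion, and a cell with a single read is pulled away from an apparent monoallelic value of 1.0, while well-covered cells stay close to their raw proportions.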