[1] Helmholtz Zentrum München-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany. [2] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
[1] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. [2] Wellcome Trust Sanger Institute, Hinxton, UK.
Nat Biotechnol. 2015 Feb;33(2):155-60. doi: 10.1038/nbt.3102. Epub 2015 Jan 19.
Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.
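The idea of accounting for a hidden factor such as the cell cycle can be illustrated with a small simulation. The sketch below is not the scLVM implementation, and the abstract does not specify the model's details. It swaps in a much simpler linear stand-in: the hidden factor is estimated as the first principal component of a set of known marker genes, and its linear effect is then regressed out of every gene. All function names (`estimate_hidden_factor`, `regress_out`, `simulate`), the gene layout, and the effect sizes are assumptions made for illustration only.

```python
import numpy as np


def estimate_hidden_factor(expr, marker_idx):
    """Estimate one hidden factor per cell as the first principal
    component score of the marker genes (expr is cells x genes)."""
    markers = expr[:, marker_idx]
    centered = markers - markers.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]


def regress_out(expr, factor):
    """Remove the linear effect of the factor from every gene by
    least squares, leaving each gene's mean unchanged."""
    f = factor - factor.mean()
    centered = expr - expr.mean(axis=0)
    beta = centered.T @ f / (f @ f)  # one slope per gene
    return expr - np.outer(f, beta)


def simulate(n_cells=200, n_genes=50, n_markers=10, seed=0):
    """Toy data (hypothetical): every gene is driven by a hidden
    'cell cycle' factor; genes 30 and above also carry a weaker
    two-group subpopulation signal."""
    rng = np.random.default_rng(seed)
    cycle = rng.normal(size=n_cells)
    group = rng.integers(0, 2, size=n_cells).astype(float)
    expr = rng.normal(scale=0.5, size=(n_cells, n_genes))
    expr[:, :n_markers] += 3.0 * cycle[:, None]  # marker genes
    expr[:, n_markers:] += 2.0 * cycle[:, None]  # confounded genes
    expr[:, 30:] += 1.0 * group[:, None]         # subpopulation signal
    return expr, cycle, group
```

A possible usage: call `expr, cycle, group = simulate()`, then `factor = estimate_hidden_factor(expr, np.arange(10))` and `corrected = regress_out(expr, factor)`. In this toy setting the subpopulation signal in genes 30 and above is largely masked by the hidden factor before correction and becomes easier to see afterwards, which mirrors the motivation given in the abstract under much stronger simplifying assumptions.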