Department of Human Genetics, McGill University, Montreal, QC, H3A 0C7, Canada.
Victor P. Dahdaleh Institute of Genomic Medicine, Montreal, QC, H3A 0G1, Canada.
Nat Commun. 2024 Aug 3;15(1):6573. doi: 10.1038/s41467-024-50963-0.
Single-cell analysis across multiple samples and conditions requires quantitative modeling of the interplay between the continuum of cell states and the technical and biological sources of sample-to-sample variability. We introduce GEDI, a generative model that identifies latent space variations in multi-sample, multi-condition single-cell datasets and attributes them to sample-level covariates. GEDI enables cross-sample cell state mapping on par with state-of-the-art integration methods, cluster-free differential gene expression analysis along the continuum of cell states, and machine learning-based prediction of sample characteristics from single-cell data. GEDI can also incorporate gene-level prior knowledge to infer pathway and regulatory network activities in single cells. Finally, GEDI extends all these concepts to previously unexplored modalities that require joint consideration of dual measurements, such as the joint analysis of exon inclusion/exclusion reads to model alternative cassette exon splicing, or spliced/unspliced reads to model the mRNA stability landscapes of single cells.