Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China.
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa287.
Simultaneous profiling of transcriptomic and chromatin accessibility information in the same individual cells offers unprecedented resolution for understanding cell states. However, computationally effective methods for integrating these inherently sparse and heterogeneous data are lacking. Here, we present a single-cell multimodal variational autoencoder model, which combines three types of joint-learning strategies with a probabilistic Gaussian Mixture Model to learn joint latent features that accurately represent these multilayer profiles. Studies on both simulated and real datasets demonstrate that the model is better able to (i) dissect cellular heterogeneity in the joint-learning space, (ii) denoise and impute data and (iii) construct associations between multilayer omics data, which can be used to understand transcriptional regulatory mechanisms.
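As a rough illustration of the idea of a joint latent space with a Gaussian mixture over it, the following is a minimal NumPy sketch, not the published model. The abstract does not name the three joint-learning strategies, so the encoder here is a hypothetical single linear layer that adds the projections of both modalities, and all names (`encode_joint`, `gmm_responsibilities`), dimensions, and the random toy data are assumptions made for illustration only.

```python
import numpy as np


def encode_joint(rna, atac, w_rna, w_atac):
    """Hypothetical encoder: project each modality linearly, add the two
    projections, and squash with tanh to get one latent vector per cell."""
    return np.tanh(rna @ w_rna + atac @ w_atac)


def gmm_responsibilities(z, means, variances, weights):
    """Posterior probability of each diagonal-Gaussian component per cell.

    z: (n_cells, d); means, variances: (k, d); weights: (k,)
    """
    diff = z[:, None, :] - means[None, :, :]  # (n_cells, k, d)
    log_pdf = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )  # (n_cells, k)
    log_joint = np.log(weights) + log_pdf
    # Subtract the row-wise maximum for numerical stability before exp.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)


# Toy data (assumed sizes): sparse counts for RNA, binary peaks for ATAC.
rng = np.random.default_rng(0)
n_cells, n_genes, n_peaks, latent_dim, n_clusters = 6, 20, 30, 4, 3
rna = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)
atac = rng.binomial(1, 0.1, size=(n_cells, n_peaks)).astype(float)

# Untrained random weights; a real model would learn these.
w_rna = rng.normal(scale=0.1, size=(n_genes, latent_dim))
w_atac = rng.normal(scale=0.1, size=(n_peaks, latent_dim))
z = encode_joint(rna, atac, w_rna, w_atac)

# Untrained mixture parameters, again purely illustrative.
means = rng.normal(size=(n_clusters, latent_dim))
variances = np.ones((n_clusters, latent_dim))
weights = np.full(n_clusters, 1.0 / n_clusters)

resp = gmm_responsibilities(z, means, variances, weights)
clusters = resp.argmax(axis=1)  # hard cluster assignment per cell
```

In a trained model of this kind, the mixture responsibilities would give a soft assignment of each cell to a latent cluster, which is one way the joint space could be used to separate cell populations.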