Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
Center of Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
Nat Commun. 2022 Dec 13;13(1):7705. doi: 10.1038/s41467-022-35031-9.
Single-cell multimodal sequencing technologies have been developed to simultaneously profile different modalities of data in the same cell. They provide a unique opportunity to jointly analyze multimodal data at the single-cell level for the identification of distinct cell types. A correct clustering result is essential for downstream studies of complex biological function. However, combining different data sources for clustering analysis of single-cell multimodal data remains a statistical and computational challenge. Here, we develop a novel multimodal deep learning method, scMDC, for clustering analysis of single-cell multi-omics data. scMDC is an end-to-end deep model that explicitly characterizes different data sources and jointly learns latent features of a deep embedding for clustering analysis. Extensive simulation and real-data experiments reveal that scMDC outperforms existing single-cell single-modal and multimodal clustering methods on different single-cell multimodal datasets. The linear scalability of its running time makes scMDC a promising method for analyzing large multimodal datasets.