School of Software, Shandong University, Jinan 250101, Shandong, China.
Joint SDU-NTU Centre for Artificial Intelligence Research, Shandong University, Jinan 250101, China.
Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad133.
Integrating single-cell multi-omics data can uncover the regulatory basis of diverse cell types and states. However, current methods disregard the individuality of each omics layer, and the high noise, sparsity, and heterogeneity of single-cell data further degrade the quality of the fused representation. Moreover, existing single-cell clustering methods focus only on cell-type clustering and cannot mine alternative clusterings that would support a more comprehensive analysis of cells.
We propose a single-cell data fusion-based multiple clustering (scMCs) approach that jointly models single-cell transcriptomic and epigenetic data and explores multiple distinct clusterings. scMCs first mines omics-specific and cross-omics consistent representations, then fuses them into a co-embedding representation that can be used to dissect cellular heterogeneity and impute data. To discover alternative clusterings embedded in the multi-omics data, scMCs projects the co-embedding representation into different salient subspaces. At the same time, it reduces the redundancy between subspaces to increase the diversity of the alternative clusterings, and it optimizes the cluster centers in each subspace to improve the quality of the corresponding clustering. Unlike a single clustering, these alternative clusterings offer additional perspectives on complex genetic information, such as cell types and cell states. Experimental results show that scMCs can effectively identify subcellular types, impute dropout events, and uncover diverse cell characteristics by producing different but meaningful clusterings.
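The subspace step described above can be illustrated with a minimal sketch. This is not the scMCs implementation: the projections here are fixed, hand-picked axis-aligned bases rather than learned ones, plain k-means stands in for the cluster-center optimization, and the redundancy measure (squared Frobenius norm of the product of two orthonormal bases) is one common choice assumed for illustration. All function names, the toy data, and the parameters are hypothetical.

```python
import numpy as np


def redundancy(p_a, p_b):
    """Overlap between two subspaces given by orthonormal column bases.

    Returns 0 when the subspaces are orthogonal and k when two
    k-dimensional subspaces coincide. (Assumed measure, for illustration.)
    """
    return float(np.linalg.norm(p_a.T @ p_b, "fro") ** 2)


def kmeans(x, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from distinct data points.
    centers = x[rng.choice(len(x), size=n_clusters, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = x[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels, centers


def multiple_clusterings(z, projections, n_clusters):
    """Cluster the co-embedding z separately in each projected subspace."""
    return [kmeans(z @ p, n_clusters)[0] for p in projections]


# Toy co-embedding: dimension 0 carries a "type"-like grouping and
# dimension 2 carries an independent "state"-like grouping.
rng = np.random.default_rng(0)
n = 200
type_id = rng.integers(0, 2, size=n)
state_id = rng.integers(0, 2, size=n)
z = 0.1 * rng.standard_normal((n, 4))
z[:, 0] += 3.0 * type_id
z[:, 2] += 3.0 * state_id

# Two orthogonal 2-D subspaces (hand-picked here, learned in a real method).
p_type = np.eye(4)[:, :2]
p_state = np.eye(4)[:, 2:]

labels_type, labels_state = multiple_clusterings(
    z, [p_type, p_state], n_clusters=2
)
print("redundancy between subspaces:", redundancy(p_type, p_state))
```

In this toy setting the two subspaces are orthogonal, so each clustering is driven by a different grouping of the same cells. A learned variant would treat the redundancy term as a penalty to minimize while fitting the projections.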
The code is available at www.sdu-idea.cn/codes.php?name=scMCs.