Center for Computational Biology, Beijing Institute of Basic Medical Sciences, Beijing, China.
College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China.
Nat Biotechnol. 2024 Oct;42(10):1594-1605. doi: 10.1038/s41587-023-02040-y. Epub 2024 Jan 23.
Integrating single-cell datasets produced by multiple omics technologies is essential for defining cellular heterogeneity. Mosaic integration, in which different datasets share only some of the measured modalities, poses major challenges, particularly regarding modality alignment and batch effect removal. Here, we present a deep probabilistic framework for the mosaic integration and knowledge transfer (MIDAS) of single-cell multimodal data. MIDAS simultaneously achieves dimensionality reduction, imputation and batch correction of mosaic data by using self-supervised modality alignment and information-theoretic latent disentanglement. We demonstrate its reliability and its superiority to 19 other methods by evaluating its performance in trimodal and mosaic integration tasks. We also constructed a single-cell trimodal atlas of human peripheral blood mononuclear cells and tailored transfer learning and reciprocal reference mapping schemes to enable flexible and accurate knowledge transfer from the atlas to new data. Applications in mosaic integration, pseudotime analysis and cross-tissue knowledge transfer on bone marrow mosaic datasets demonstrate the versatility and superiority of MIDAS. MIDAS is available at https://github.com/labomics/midas.
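As a rough illustration of the mosaic setting described above, the following Python sketch represents which modalities each batch measures and derives what would have to be imputed. It is not the MIDAS API: the batch names, the modality labels (RNA, ADT, ATAC), and the helper functions are all hypothetical and chosen only for this example.

```python
# Hypothetical modality labels and batch layout, for illustration only.
MODALITIES = ("RNA", "ADT", "ATAC")

batches = {
    "batch_1": {"RNA", "ADT", "ATAC"},  # fully trimodal batch
    "batch_2": {"RNA", "ADT"},          # lacks ATAC
    "batch_3": {"RNA", "ATAC"},         # lacks ADT
    "batch_4": {"ATAC"},                # single-modality batch
}


def modality_mask(batches, modalities=MODALITIES):
    """Return {batch: [1 or 0 per modality]} marking measured modalities."""
    return {
        name: [int(m in measured) for m in modalities]
        for name, measured in batches.items()
    }


def shared_modalities(batches):
    """Modalities measured in every batch (can be empty in a mosaic design)."""
    return set.intersection(*batches.values())


def missing_modalities(batches, modalities=MODALITIES):
    """Modalities each batch lacks, i.e. the candidates for imputation."""
    return {
        name: [m for m in modalities if m not in measured]
        for name, measured in batches.items()
    }


mask = modality_mask(batches)
shared = shared_modalities(batches)
missing = missing_modalities(batches)
```

In this toy layout no modality is common to all four batches, which is the situation that makes modality alignment hard: batches can only be linked through partially overlapping measurements.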