LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China.
Nat Commun. 2022 Dec 1;13(1):7419. doi: 10.1038/s41467-022-35094-8.
Single-cell data integration can provide a comprehensive molecular view of cells. However, integrating heterogeneous single-cell multi-omics data together with spatially resolved transcriptomic data remains a major challenge. Here we introduce uniPort, a unified single-cell data integration framework that combines a coupled variational autoencoder (coupled-VAE) with minibatch unbalanced optimal transport (Minibatch-UOT). It leverages both common and dataset-specific highly variable genes to handle heterogeneity across datasets, and it scales to large datasets. uniPort jointly embeds heterogeneous single-cell multi-omics datasets into a shared latent space, and it can further construct a reference atlas for gene imputation across datasets. In addition, uniPort provides a flexible label transfer framework that deconvolutes heterogeneous spatial transcriptomic data using the optimal transport plan rather than the embedded latent space. We demonstrate the capability of uniPort by applying it to integrate a variety of datasets, including single-cell transcriptomics, chromatin accessibility, and spatially resolved transcriptomic data.
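To make the role of the transport plan concrete, the following is a minimal sketch of entropic unbalanced optimal transport on a single minibatch, followed by label transfer through the resulting plan. It is not uniPort's implementation: the function names (`unbalanced_sinkhorn`, `transfer_labels`), the parameters `eps` and `tau`, the squared Euclidean cost, and the toy two-cluster data are all assumptions chosen for illustration, and the coupled-VAE that would normally supply the cost is omitted.

```python
import numpy as np

def unbalanced_sinkhorn(cost, a, b, eps=0.05, tau=1.0, n_iter=200):
    """Entropic unbalanced OT with KL-relaxed marginals (scaling iterations).

    cost: (n_src, n_tgt) cost matrix; a, b: source and target weights.
    eps is the entropic regularization, tau the marginal relaxation strength
    (both hypothetical defaults, not values from the paper).
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = tau / (tau + eps)        # exponent < 1 softens the marginal constraints
    for _ in range(n_iter):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]   # transport plan, shape (n_src, n_tgt)

def transfer_labels(plan, source_labels, n_classes):
    """Push source labels through the plan to get per-target label proportions."""
    one_hot = np.eye(n_classes)[source_labels]        # (n_src, n_classes)
    scores = plan.T @ one_hot                         # (n_tgt, n_classes)
    return scores / scores.sum(axis=1, keepdims=True)

# Toy minibatch: 40 labeled "cells" in two clusters, 10 unlabeled "spots".
rng = np.random.default_rng(0)
src = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(3.0, 0.1, (20, 2))])
src_labels = np.array([0] * 20 + [1] * 20)
tgt = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(3.0, 0.1, (5, 2))])

# Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(len(src), 1.0 / len(src))   # uniform source weights
b = np.full(len(tgt), 1.0 / len(tgt))   # uniform target weights

plan = unbalanced_sinkhorn(cost, a, b)
proportions = transfer_labels(plan, src_labels, n_classes=2)
print(proportions.round(3))
```

Each row of `proportions` can be read as an estimated cell-type composition of one spot, which is the sense in which a transport plan, rather than a latent embedding, can drive deconvolution. In a minibatch setting the same computation would be repeated on randomly sampled batches from each dataset.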