Institut Pasteur, Université Paris Cité, CNRS UMR 3738, Machine Learning for Integrative Genomics Group, F-75015, Paris, France.
Institut de Biologie de l'Ecole Normale Supérieure, CNRS, INSERM, Ecole Normale Supérieure, Université PSL, 75005, Paris, France.
Nat Commun. 2023 Nov 24;14(1):7711. doi: 10.1038/s41467-023-43019-2.
Profiling multiple molecular layers from the same set of cells has recently become possible, creating a growing need for multi-view learning methods that can analyze these data jointly. Here we present Multi-Omics Wasserstein inteGrative anaLysIs (Mowgli), a novel method for integrating paired multi-omics data with any type and number of omics. Mowgli combines integrative Nonnegative Matrix Factorization with Optimal Transport, improving both the clustering performance and the interpretability of integrative Nonnegative Matrix Factorization. We apply Mowgli to several paired single-cell multi-omics datasets profiled with 10X Multiome, CITE-seq, and TEA-seq. Our in-depth benchmark shows that Mowgli is competitive with the state of the art in cell clustering and superior to it in biological interpretability. Mowgli is implemented as a Python package integrated within the scverse ecosystem and is available at http://github.com/cantinilab/mowgli.
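To make the idea of integrative Nonnegative Matrix Factorization concrete, the following is a minimal NumPy sketch, not Mowgli's implementation and not its API. It factorizes each omics matrix into its own nonnegative dictionary and one cell embedding shared by all omics. As a loudly stated simplification, the reconstruction loss here is the squared Euclidean error with standard multiplicative updates, whereas the abstract states that Mowgli uses Optimal Transport. The function name `joint_nmf`, the variable names, and the toy "rna" and "atac" matrices are all hypothetical.

```python
import numpy as np


def joint_nmf(views, n_factors, n_iter=200, eps=1e-9, seed=0):
    """Factorize each view X_p (features_p x cells) as H_p @ W.

    W (n_factors x cells) is shared by all views; each H_p is view-specific.
    Assumption: squared Euclidean loss, NOT the Optimal Transport loss
    that the abstract attributes to Mowgli.
    """
    rng = np.random.default_rng(seed)
    n_cells = views[0].shape[1]
    W = rng.random((n_factors, n_cells))
    H = [rng.random((X.shape[0], n_factors)) for X in views]
    losses = []
    for _ in range(n_iter):
        # Update each view-specific dictionary with the embedding held fixed.
        for p, X in enumerate(views):
            H[p] *= (X @ W.T) / (H[p] @ (W @ W.T) + eps)
        # Update the shared embedding using evidence pooled over all views.
        numerator = sum(Hp.T @ X for Hp, X in zip(H, views))
        denominator = sum(Hp.T @ Hp for Hp in H) @ W + eps
        W *= numerator / denominator
        # Track the total reconstruction error across views.
        losses.append(
            sum(np.linalg.norm(X - Hp @ W) ** 2 for Hp, X in zip(H, views))
        )
    return H, W, losses


# Toy paired data: two hypothetical omics measured on the same 40 cells,
# both generated from the same 3 latent factors.
rng = np.random.default_rng(1)
W_true = rng.random((3, 40))
rna = rng.random((50, 3)) @ W_true
atac = rng.random((30, 3)) @ W_true

H, W, losses = joint_nmf([rna, atac], n_factors=3)
```

Because `W` is shared, its columns give a single low-dimensional representation per cell that could be passed to a clustering step, while each `H[p]` shows which features of that omics layer load on each factor.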