Department of Informatics, Systems and Communication, Università degli Studi di Milano-Bicocca, Milan, 20125, Italy.
Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy.
Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae300.
Single-cell profiling has become a common practice to investigate the complexity of tissues, organs, and organisms. Recent technological advances are expanding our capabilities to profile various molecular layers beyond the transcriptome such as, but not limited to, the genome, the epigenome, and the proteome. Depending on the experimental procedure, these data can be obtained from separate assays or from the very same cells. Yet, integration of more than two assays is currently not supported by the majority of the available computational frameworks.
We here propose a Multi-Omic data integration framework based on Wasserstein Generative Adversarial Networks suitable for the analysis of paired or unpaired data with a high number of modalities (>2). At the core of our strategy is a single network trained on all modalities together, limiting the computational burden when many molecular layers are evaluated.
Source code of our framework is available at https://github.com/vgiansanti/MOWGAN.