An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics.

Affiliations

Delft Bioinformatics Lab, Delft University of Technology.

Department of Medical Oncology, Erasmus University Medical Center.

Publication information

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad416.

DOI: 10.1093/bib/bbad416

PMID: 38018908

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10685331/

Abstract

Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but recently there has been a rise in the popularity of neural architectures that embed paired -omics into the same non-linear manifold. This work describes a head-to-head comparison of linear and non-linear joint embedding methods using both bulk and single-cell multi-modal datasets. We found that non-linear methods have a clear advantage with respect to linear ones for missing modality imputation. Performance comparisons in the downstream tasks of survival analysis for bulk tumor data and cell type classification for single-cell data lead to the following insights: First, concatenating the principal components of each modality is a competitive baseline and hard to beat if all modalities are available at test time. However, if we only have one modality available at test time, training a predictive model on the joint space of that modality can lead to performance improvements with respect to just using the unimodal principal components. Second, -omic profiles imputed by neural joint embedding methods are realistic enough to be used by a classifier trained on real data with limited performance drops. Taken together, our comparisons give hints to which joint embedding to use for which downstream task. Overall, product-of-experts performed well in most tasks and was reasonably fast, while early integration (concatenation) of modalities did quite poorly.
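The baseline the abstract calls competitive, concatenating the principal components of each modality, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`fit_pca`, `transform`, `concat_pc_embedding`), the number of components, and the random toy matrices are all hypothetical, and the paper's actual preprocessing is not described in the abstract.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA via SVD on the centered data; return (mean, components)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def transform(X, model):
    """Project samples onto the principal components of a fitted model."""
    mean, components = model
    return (X - mean) @ components.T

def concat_pc_embedding(train_modalities, test_modalities, n_components):
    """Fit one PCA per modality on the training data, then concatenate
    the per-modality scores into a single feature matrix."""
    train_parts, test_parts = [], []
    for X_train, X_test in zip(train_modalities, test_modalities):
        model = fit_pca(X_train, n_components)
        train_parts.append(transform(X_train, model))
        test_parts.append(transform(X_test, model))
    return np.hstack(train_parts), np.hstack(test_parts)

# Toy data (assumption): two paired modalities measured on the same samples.
rng = np.random.default_rng(0)
rna_train, rna_test = rng.normal(size=(50, 200)), rng.normal(size=(10, 200))
meth_train, meth_test = rng.normal(size=(50, 300)), rng.normal(size=(10, 300))

Z_train, Z_test = concat_pc_embedding(
    [rna_train, meth_train], [rna_test, meth_test], n_components=10
)
print(Z_train.shape, Z_test.shape)  # (50, 20) (10, 20)
```

The concatenated matrices can be passed to any downstream model, for example a survival model or a cell type classifier. Note that this baseline needs every modality at test time, which is the limitation that joint embedding methods are meant to address.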

Figure (bbad416f1.jpg): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1775/10685331/87d9f3fb6663/bbad416f1.jpg
