Joint variational autoencoders for multimodal imputation and embedding.

Author information

Kalafut Noah Cohen, Huang Xiang, Wang Daifeng

Affiliations

Department of Computer Sciences, Wisconsin, US.

Waisman Center, University of Wisconsin-Madison, Wisconsin, US.

Publication information

Nat Mach Intell. 2023 Jun;5(6):631-642. doi: 10.1038/s42256-023-00663-z. Epub 2023 May 29.

DOI:10.1038/s42256-023-00663-z

PMID:39175596

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11340721/

Abstract

Single-cell multimodal datasets have measured various characteristics of individual cells, enabling a deep understanding of cellular and molecular mechanisms. However, multimodal data generation remains costly and challenging, and missing modalities happen frequently. Recently, machine learning approaches have been developed for data imputation but typically require fully matched multimodalities to learn common latent embeddings that potentially lack modality specificity. To address these issues, we developed an open-source machine learning model, Joint Variational Autoencoders for multimodal Imputation and Embedding (JAMIE). JAMIE takes single-cell multimodal data that can have partially matched samples across modalities. Variational autoencoders learn the latent embeddings of each modality. Then, embeddings from matched samples across modalities are aggregated to identify joint cross-modal latent embeddings before reconstruction. To perform cross-modal imputation, the latent embeddings of one modality can be used with the decoder of the other modality. For interpretability, Shapley values are used to prioritize input features for cross-modal imputation and known sample labels. We applied JAMIE to both simulation data and emerging single-cell multimodal data including gene expression, chromatin accessibility, and electrophysiology in human and mouse brains. JAMIE significantly outperforms existing state-of-the-art methods in general and prioritized multimodal features for imputation, providing potentially novel mechanistic insights at cellular resolution.
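The data flow described in the abstract (one variational autoencoder per modality, aggregation of embeddings for matched cells, and imputation by pairing one modality's encoder with the other modality's decoder) can be illustrated with a minimal sketch. This is not the JAMIE implementation: the `ToyVAE` class, the untrained random linear layers, the plain averaging in `aggregate`, and the feature counts are all assumptions made for illustration, and no training loss is shown.

```python
import math
import random


def init_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim); stands in for a trained layer."""
    return [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(weights, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class ToyVAE:
    """Hypothetical encoder/decoder pair for one modality (Gaussian latent)."""

    def __init__(self, n_features, latent_dim, rng):
        self.w_mu = init_linear(n_features, latent_dim, rng)
        self.w_logvar = init_linear(n_features, latent_dim, rng)
        self.w_dec = init_linear(latent_dim, n_features, rng)
        self.rng = rng

    def encode(self, x):
        """Return the mean and log-variance of q(z | x)."""
        return linear(self.w_mu, x), linear(self.w_logvar, x)

    def sample(self, mu, logvar):
        """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1)."""
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z):
        """Map a latent embedding back to this modality's feature space."""
        return linear(self.w_dec, z)


def aggregate(z_a, z_b):
    """Assumed aggregation rule: plain average of two matched embeddings."""
    return [(a + b) / 2.0 for a, b in zip(z_a, z_b)]


def impute(source_vae, target_vae, x_source):
    """Cross-modal imputation: source encoder mean fed to the target decoder."""
    mu, _ = source_vae.encode(x_source)
    return target_vae.decode(mu)


rng = random.Random(0)
rna = ToyVAE(n_features=6, latent_dim=2, rng=rng)   # e.g. gene expression
atac = ToyVAE(n_features=4, latent_dim=2, rng=rng)  # e.g. chromatin accessibility

x_rna = [1.0, 0.0, 2.0, 0.5, 0.0, 1.5]
x_atac = [0.0, 1.0, 1.0, 0.0]

# Matched cell: embed each modality, aggregate, then reconstruct both.
z_rna = rna.sample(*rna.encode(x_rna))
z_atac = atac.sample(*atac.encode(x_atac))
z_joint = aggregate(z_rna, z_atac)
recon_rna = rna.decode(z_joint)
recon_atac = atac.decode(z_joint)

# Cell with only gene expression measured: impute the missing modality.
imputed_atac = impute(rna, atac, x_rna)
```

Using the encoder mean rather than a sampled embedding in `impute` keeps the imputed profile deterministic; that is a design choice of this sketch, not something stated in the abstract.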

Similar articles

DeepGAMI: deep biologically guided auxiliary learning for multimodal integration and imputation to improve genotype-phenotype prediction.

Genome Med. 2023 Oct 31;15(1):88. doi: 10.1186/s13073-023-01248-6.

Robust probabilistic modeling for single-cell multimodal mosaic integration and imputation via scVAEIT.

Proc Natl Acad Sci U S A. 2022 Dec 6;119(49):e2214414119. doi: 10.1073/pnas.2214414119. Epub 2022 Dec 2.

Ensemble deep learning of embeddings for clustering multimodal single-cell omics data.

Bioinformatics. 2023 Jun 1;39(6). doi: 10.1093/bioinformatics/btad382.

Multimodal Weibull Variational Autoencoder for Jointly Modeling Image-Text Data.

IEEE Trans Cybern. 2022 Oct;52(10):11156-11171. doi: 10.1109/TCYB.2021.3070881. Epub 2022 Sep 19.

Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited.

IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3030-3047. doi: 10.1109/TPAMI.2020.3045530. Epub 2022 May 5.

Data Augmentation with Cross-Modal Variational Autoencoders (DACMVA) for Cancer Survival Prediction.

Information (Basel). 2024 Jan;15(1). doi: 10.3390/info15010007. Epub 2023 Dec 21.

InClust+: the deep generative framework with mask modules for multimodal data integration, imputation, and cross-modal generation.

BMC Bioinformatics. 2024 Jan 24;25(1):41. doi: 10.1186/s12859-024-05656-2.

scJVAE: A novel method for integrative analysis of multimodal single-cell data.

Comput Biol Med. 2023 May;158:106865. doi: 10.1016/j.compbiomed.2023.106865. Epub 2023 Apr 4.

Normative Modeling using Multimodal Variational Autoencoders to Identify Abnormal Brain Volume Deviations in Alzheimer's Disease.

Proc SPIE Int Soc Opt Eng. 2023 Feb;12465. doi: 10.1117/12.2654369. Epub 2023 Apr 7.

Cited by

scDCT: a conditional diffusion-based deep learning model for high-fidelity single-cell cross-modality translation.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf400.

COEXIST: Coordinated single-cell integration of serial multiplexed tissue images.

PLoS Comput Biol. 2025 Aug 5;21(8):e1013325. doi: 10.1371/journal.pcbi.1013325. eCollection 2025 Aug.

A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf355.

DGAT: A Dual-Graph Attention Network for Inferring Spatial Protein Landscapes from Transcriptomics.

bioRxiv. 2025 Jul 9:2025.07.05.662121. doi: 10.1101/2025.07.05.662121.

New horizons at the interface of artificial intelligence and translational cancer research.

Cancer Cell. 2025 Apr 14;43(4):708-727. doi: 10.1016/j.ccell.2025.03.018.

Joint imputation and deconvolution of gene expression across spatial transcriptomics platforms.

bioRxiv. 2025 Feb 19:2025.02.17.638195. doi: 10.1101/2025.02.17.638195.

DPImpute: A Genotype Imputation Framework for Ultra-Low Coverage Whole-Genome Sequencing and its Application in Genomic Selection.

Adv Sci (Weinh). 2025 Apr;12(16):e2412482. doi: 10.1002/advs.202412482. Epub 2025 Feb 27.

Multiomics Research: Principles and Challenges in Integrated Analysis.

Biodes Res. 2024 Dec 5;6:0059. doi: 10.34133/bdr.0059. eCollection 2024.

Unsupervised data imputation with multiple importance sampling variational autoencoders.

Sci Rep. 2025 Jan 27;15(1):3409. doi: 10.1038/s41598-025-87641-0.

Variational graph autoencoder for reconstructed transcriptomic data associated with NLRP3 mediated pyroptosis in periodontitis.

Sci Rep. 2025 Jan 14;15(1):1962. doi: 10.1038/s41598-025-86455-4.
