OmiEmbed: A Unified Multi-Task Deep Learning Framework for Multi-Omics Data

Authors

Zhang Xiaoyu, Xing Yuting, Sun Kai, Guo Yike

Affiliations

Data Science Institute, Imperial College London, London SW7 2AZ, UK.

Department of Computer Science, Hong Kong Baptist University, Hong Kong 999077, China.

Publication information

Cancers (Basel). 2021 Jun 18;13(12):3047. doi: 10.3390/cancers13123047.

DOI:10.3390/cancers13123047

PMID:34207255

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8235477/

Abstract

High-dimensional omics data contain intrinsic biomedical information that is crucial for personalised medicine. Nevertheless, it is challenging to capture them from the genome-wide data, due to the large number of molecular features and small number of available samples, which is also called "the curse of dimensionality" in machine learning. To tackle this problem and pave the way for machine learning-aided precision medicine, we proposed a unified multi-task deep learning framework named OmiEmbed to capture biomedical information from high-dimensional omics data with the deep embedding and downstream task modules. The deep embedding module learnt an omics embedding that mapped multiple omics data types into a latent space with lower dimensionality. Based on the new representation of multi-omics data, different downstream task modules were trained simultaneously and efficiently with the multi-task strategy to predict the comprehensive phenotype profile of each sample. OmiEmbed supports multiple tasks for omics data including dimensionality reduction, tumour type classification, multi-omics integration, demographic and clinical feature reconstruction, and survival prediction. The framework outperformed other methods on all three types of downstream tasks and achieved better performance with the multi-task strategy compared to training them individually. OmiEmbed is a powerful and unified framework that can be widely adapted to various applications of high-dimensional omics data and has great potential to facilitate more accurate and personalised clinical decision making.
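The two-part design described in the abstract (one shared embedding feeding several downstream task modules that are trained together) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the `tanh` encoder, the task names `tumour_type` and `age`, and the equal task weights are all assumptions chosen for illustration, and the survival task is left out for brevity.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def embed(omics_blocks, encoder):
    """Concatenate the omics types and map them to a low-dimensional latent vector."""
    x = [v for block in omics_blocks for v in block]
    return [math.tanh(h) for h in linear(x, *encoder)]

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (classification-style task)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def squared_error(pred, target):
    """Squared error for one sample (regression-style task)."""
    return (pred - target) ** 2

def multi_task_loss(latent, heads, targets, task_weights):
    """Weighted sum of per-task losses computed from the same shared embedding."""
    cls_loss = cross_entropy(linear(latent, *heads["tumour_type"]), targets["tumour_type"])
    reg_loss = squared_error(linear(latent, *heads["age"])[0], targets["age"])
    return task_weights["tumour_type"] * cls_loss + task_weights["age"] * reg_loss

rng = random.Random(0)
# Toy sample: two omics types with 6 and 4 features (real data has far more).
expression = [rng.random() for _ in range(6)]
methylation = [rng.random() for _ in range(4)]

latent_dim = 3  # hypothetical latent size
encoder = init_layer(rng, 10, latent_dim)
heads = {
    "tumour_type": init_layer(rng, latent_dim, 4),  # 4 hypothetical classes
    "age": init_layer(rng, latent_dim, 1),          # one scalar target
}
latent = embed([expression, methylation], encoder)
loss = multi_task_loss(
    latent, heads,
    targets={"tumour_type": 2, "age": 0.5},
    task_weights={"tumour_type": 1.0, "age": 1.0},
)
print(len(latent), round(loss, 4))
```

Because every head reads the same `latent` vector, a gradient step on the combined loss would update the shared encoder with signal from all tasks at once, which is the general idea behind training the task modules simultaneously.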

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ebc/8235477/8c6e1b8c26e5/cancers-13-03047-g001.jpg
