Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes.

Authors

Ji Yanrong, Dutta Pratik, Davuluri Ramana

Affiliations

Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.

Department of Biomedical Informatics, Stony Brook Cancer Center, Stony Brook Medicine, Stony Brook University, Stony Brook, NY 11794, USA.

Publication information

Bioinform Adv. 2023 Jun 21;3(1):vbad075. doi: 10.1093/bioadv/vbad075. eCollection 2023.

DOI:10.1093/bioadv/vbad075

PMID:37424943

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10328436/

Abstract

MOTIVATION

Molecular subtyping by integrative modeling of multi-omics and clinical data can help identify robust and clinically actionable disease subgroups, an essential step in developing precision medicine approaches.

RESULTS

We developed a novel outcome-guided molecular subgrouping framework, called Deep Multi-Omics Integrative Subtyping by Maximizing Correlation (DeepMOIS-MC), for integrative learning from multi-omics data by maximizing the correlation between all input -omics views. DeepMOIS-MC consists of two parts: clustering and classification. In the clustering part, the preprocessed high-dimensional multi-omics views are input into two-layer fully connected neural networks. The outputs of the individual networks are subjected to a Generalized Canonical Correlation Analysis (GCCA) loss to learn a shared representation. Next, the learned representation is filtered by a regression model to select features that are related to a clinical covariate, for example, survival outcome. The filtered features are used for clustering to determine the optimal cluster assignments. In the classification part, the original feature matrix of one of the -omics views is scaled and discretized by equal-frequency binning, and then subjected to feature selection using a random forest. Using the selected features, classification models (for example, an XGBoost model) are built to predict the molecular subgroups identified at the clustering stage. We applied DeepMOIS-MC to lung and liver cancers using TCGA datasets. In comparative analysis, we found that DeepMOIS-MC outperformed traditional approaches in patient stratification. Finally, we validated the robustness and generalizability of the classification models on independent datasets. We anticipate that DeepMOIS-MC can be adopted for many multi-omics integrative analysis tasks.
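To make the correlation-maximizing step more concrete, the following is a minimal NumPy sketch of the standard linear GCCA objective: find a shared representation G with orthonormal columns that minimizes the sum over views of ||G - X_j U_j||_F^2. It is not the authors' PyTorch implementation. The function name `gcca_loss`, the parameters `r` and `eps`, and the input layout (samples in rows) are assumptions made for illustration. In DeepMOIS-MC the views would be the outputs of the per-omics networks and the loss would be backpropagated through them, which this sketch does not do.

```python
import numpy as np


def gcca_loss(views, r, eps=1e-8):
    """Evaluate the linear GCCA objective for a list of views.

    views: list of arrays, each of shape (n_samples, d_j); hypothetical
           stand-ins for the per-omics network outputs.
    r:     dimension of the shared representation (assumed parameter).
    eps:   small ridge term for numerical stability (assumed parameter).
    Returns (G, loss): G is (n_samples, r) with orthonormal columns.
    """
    n = views[0].shape[0]
    M = np.zeros((n, n))
    for X in views:
        Xc = X - X.mean(axis=0)                     # center each feature
        C = Xc.T @ Xc + eps * np.eye(Xc.shape[1])   # regularized Gram matrix
        M += Xc @ np.linalg.solve(C, Xc.T)          # projector onto the view's column space
    eigvals, eigvecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:r]             # indices of the r largest eigenvalues
    G = eigvecs[:, top]                             # shared representation
    # Residual of the best linear map from every view onto G, summed over views.
    loss = r * len(views) - eigvals[top].sum()
    return G, loss
```

The loss is close to zero when all views span the same subspace and grows as the views become less correlated, so minimizing it with respect to the network weights pushes the per-view outputs toward a common representation.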

AVAILABILITY AND IMPLEMENTATION

Source code for the PyTorch implementation of DGCCA and the other DeepMOIS-MC modules is available on GitHub (https://github.com/duttaprat/DeepMOIS-MC).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.
