Integration of Multimodal Data from Disparate Sources for Identifying Disease Subtypes.

Author information

Zhou Kaiyue, Kottoori Bhagya Shree, Munj Seeya Awadhut, Zhang Zhewei, Draghici Sorin, Arslanturk Suzan

Affiliations

Department of Computer Science, Wayne State University, Detroit, MI 48201, USA.

Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.

Publication information

Biology (Basel). 2022 Feb 24;11(3):360. doi: 10.3390/biology11030360.

DOI:10.3390/biology11030360

PMID:35336734

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8945377/

Abstract

Studies over the past decade have generated a wealth of molecular data that can be leveraged to better understand cancer risk, progression, and outcomes. However, because the disease is heterogeneous, analyzing data from a single modality is not enough to understand progression risk or to differentiate long- and short-term survivors. A deep-learning approach that leverages aggregate information collected from multiple repositories with multiple modalities (e.g., mRNA, DNA methylation, miRNA) could lead to a more accurate and robust prediction of disease progression. Here, we propose an autoencoder-based multimodal data fusion system, in which a fusion encoder flexibly integrates collective information available through multiple studies with partially coupled data. Our results on a fully controlled simulation-based study show that inferring the missing data through the proposed data fusion pipeline yields a predictor that is superior to baseline predictors trained with missing modalities. Results further show that short- and long-term survivors of glioblastoma multiforme, acute myeloid leukemia, and pancreatic adenocarcinoma can be successfully differentiated with AUCs of 0.94, 0.75, and 0.96, respectively.
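
The abstract reports classification performance as AUC (area under the ROC curve) for separating short- and long-term survivors. As a minimal sketch, and not the authors' evaluation code, AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores below are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted scores, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = long-term survivor, 0 = short-term survivor.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported values of 0.94, 0.75, and 0.96.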

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcca/8945377/6b078cebe6a2/biology-11-00360-g001.jpg

Similar articles

Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis.

BMC Med Inform Decis Mak. 2020 Sep 15;20(1):225. doi: 10.1186/s12911-020-01225-8.

Imitation and mirror systems in robots through Deep Modality Blending Networks.

Neural Netw. 2022 Feb;146:22-35. doi: 10.1016/j.neunet.2021.11.004. Epub 2021 Nov 16.

Long-term cancer survival prediction using multimodal deep learning.

Sci Rep. 2021 Jun 29;11(1):13505. doi: 10.1038/s41598-021-92799-4.

Robust Multimodal Learning With Missing Modalities via Parameter-Efficient Adaptation.

IEEE Trans Pattern Anal Mach Intell. 2025 Feb;47(2):742-754. doi: 10.1109/TPAMI.2024.3476487. Epub 2025 Jan 9.

Leveraging hierarchy in multimodal generative models for effective cross-modality inference.

Neural Netw. 2022 Feb;146:238-255. doi: 10.1016/j.neunet.2021.11.019. Epub 2021 Nov 24.

Improving exchange rate forecasting via a new deep multimodal fusion model.

Appl Intell (Dordr). 2022;52(14):16701-16717. doi: 10.1007/s10489-022-03342-5. Epub 2022 Mar 25.

Multimodal MRI Image Decision Fusion-Based Network for Glioma Classification.

Front Oncol. 2022 Feb 24;12:819673. doi: 10.3389/fonc.2022.819673. eCollection 2022.

Richer fusion network for breast cancer classification based on multimodal data.

BMC Med Inform Decis Mak. 2021 Apr 22;21(Suppl 1):134. doi: 10.1186/s12911-020-01340-6.

A multimodal Parkinson quantification by fusing eye and gait motion patterns, using covariance descriptors, from non-invasive computer vision.

Comput Methods Programs Biomed. 2022 Mar;215:106607. doi: 10.1016/j.cmpb.2021.106607. Epub 2021 Dec 30.

Cited by

Unraveling patient heterogeneity in complex diseases through individualized co-expression networks: a perspective.

Front Genet. 2023 Aug 10;14:1209416. doi: 10.3389/fgene.2023.1209416. eCollection 2023.
