Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling.

Authors

Chierici Marco, Bussola Nicole, Marcolini Alessia, Francescatto Margherita, Zandonà Alessandro, Trastulla Lucia, Agostinelli Claudio, Jurman Giuseppe, Furlanello Cesare

Affiliations

Fondazione Bruno Kessler, Trento, Italy.

University of Trento, Trento, Italy.

Publication information

Front Oncol. 2020 Jun 30;10:1065. doi: 10.3389/fonc.2020.01065. eCollection 2020.

DOI: 10.3389/fonc.2020.01065
PMID: 32714870
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7340129/
Abstract

Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collections of samples. The need has thus arisen for computational methods aimed at improving cancer subtyping and biomarker identification from multi-modal data. Here we apply the Integrative Network Fusion (INF) pipeline, which combines multiple omics layers by exploiting Similarity Network Fusion (SNF) within a machine learning predictive framework. INF includes a feature ranking scheme (rSNF) on SNF-integrated features, used by a classifier over juxtaposed multi-omics features (juXT). In particular, we show instances of INF implementing Random Forest (RF) and linear Support Vector Machine (LSVM) as the classifier; two baseline RF and LSVM models are also trained on juXT. Finally, a compact RF model, called rSNFi, is derived by training on the intersection of the top-ranked biomarkers from the two approaches, juXT and rSNF. All classifiers are run in a 10x5-fold cross-validation schema to warrant reproducibility, following the guidelines for an unbiased Data Analysis Plan by the US FDA-led initiatives MAQC/SEQC. INF is demonstrated on four classification tasks on three multi-modal TCGA oncogenomics datasets. Gene expression, protein expression and copy number variants are used to predict estrogen receptor status (BRCA-ER, N = 381) and breast invasive carcinoma subtypes (BRCA-subtypes, N = 305), while gene expression, miRNA expression and methylation data are used as predictor layers for acute myeloid leukemia and renal clear cell carcinoma survival (AML-OS, N = 157; KIRC-OS, N = 181). In test, INF achieved similar Matthews Correlation Coefficient (MCC) values and 97% to 83% smaller feature sizes (FS) compared with juXT for BRCA-ER (MCC: 0.83 vs. 0.80; FS: 56 vs. 1801) and BRCA-subtypes (0.84 vs. 0.80; 302 vs. 1801), while improving KIRC-OS performance (0.38 vs. 0.31; 111 vs. 2319). INF predictions are generally more accurate in test than one-dimensional omics models, with smaller signatures too, where transcriptomics consistently plays the leading role. Overall, the INF framework effectively integrates multiple data levels in oncogenomics classification tasks, improving over the performance of single layers alone and naive juxtaposition, and provides compact signature sizes.
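The quantities quoted in the abstract can be checked with a few lines of Python. The sketch below is illustrative only and is not the authors' code: `mcc_binary`, `size_reduction`, and `repeated_kfold` are hypothetical helper names, the MCC shown is the binary form (the BRCA-subtypes task would need the multi-class generalization), and the splitter is a plain unstratified repeated k-fold that only illustrates what a 10x5-fold schema iterates over.

```python
import math
import random


def mcc_binary(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def size_reduction(compact, baseline):
    """Fractional reduction in feature-set size relative to a baseline."""
    return 1.0 - compact / baseline


def repeated_kfold(n_samples, n_repeats=10, n_folds=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx); assumption: no stratification."""
    rng = random.Random(seed)
    for r in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh shuffle for every repeat
        for f in range(n_folds):
            test = idx[f::n_folds]  # every n_folds-th sample forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield r, f, train, test


# Feature sizes (FS) quoted in the abstract: INF vs. juXT.
reductions = {
    "BRCA-ER": size_reduction(56, 1801),
    "BRCA-subtypes": size_reduction(302, 1801),
    "KIRC-OS": size_reduction(111, 2319),
}

# 10 repeats x 5 folds over the BRCA-ER sample size gives 50 train/test splits.
splits = list(repeated_kfold(381))

for task, value in reductions.items():
    print(f"{task}: {value:.1%} fewer features than juXT")
```

Rounded to whole percentages, the BRCA-ER and BRCA-subtypes reductions correspond to the "97% to 83%" range stated in the abstract; the KIRC-OS reduction is computed from the quoted feature sizes in the same way.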


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/1b8ab7eeb566/fonc-10-01065-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/6539cd060212/fonc-10-01065-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/3584e3142e24/fonc-10-01065-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/2f571a09c563/fonc-10-01065-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/531747fcc672/fonc-10-01065-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b707/7340129/9445a3ba96f9/fonc-10-01065-g0006.jpg
