An Integrative Multi-Omics Random Forest Framework for Robust Biomarker Discovery.

Authors

Zhang Wei, Huang Hanchen, Wang Lily, Lehmann Brian D, Chen Steven X

Affiliations

Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA.

Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL 33136, USA.

Publication information

bioRxiv. 2025 Mar 6:2025.03.05.641533. doi: 10.1101/2025.03.05.641533.

DOI: 10.1101/2025.03.05.641533
PMID: 40093058
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11908250/
Abstract

High-throughput technologies now produce a wide array of omics data, from genomic and transcriptomic profiles to epigenomic and proteomic measurements. Integrating these diverse data types can yield deeper insights into the biological mechanisms driving complex traits and diseases. Yet, extracting key shared biomarkers from multiple data layers remains a major challenge. We present a multivariate random forest (MRF)-based framework enhanced by a novel inverse minimal depth (IMD) metric for integrative variable selection. By assigning response variables to tree nodes and employing IMD to rank predictors, our approach efficiently identifies essential features across different omics types, even when confronted with high dimensionality and noise. Through extensive simulations and analyses of multi-omics datasets from The Cancer Genome Atlas, we demonstrate that our method outperforms established integrative techniques in uncovering biologically meaningful biomarkers and pathways. Our findings show that selected biomarkers not only correlate with known regulatory and signaling networks but can also stratify patient subgroups with distinct clinical outcomes. The method's scalable, interpretable, and user-friendly implementation ensures broad applicability to a range of research questions. This MRF-based framework advances robust biomarker discovery and integrative multi-omics analyses, accelerating the translation of complex molecular data into tangible biological and clinical insights.
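
The abstract names the IMD metric but does not define it, so the following is only a rough sketch of the general idea of ranking predictors by how close to the root they first split a tree. Everything in it is an assumption made for illustration: the dict-based toy trees, the `node` helper, the function names, the transform 1 / (1 + depth), and the averaging over trees. It does not model the multivariate-response part of the framework (assigning response variables to tree nodes) and should not be read as the authors' implementation.

```python
from collections import deque


def node(feature, left=None, right=None):
    """Hypothetical toy tree node: splits on `feature`; a child of None is a leaf."""
    return {"feature": feature, "left": left, "right": right}


def minimal_depths(tree, n_features):
    """For each feature, return the depth of the shallowest node that splits
    on it (root = depth 0), or None if the tree never splits on it."""
    depths = [None] * n_features
    queue = deque([(tree, 0)])
    while queue:
        current, depth = queue.popleft()
        if current is None:  # leaf
            continue
        f = current["feature"]
        if depths[f] is None:  # breadth-first order reaches shallow nodes first
            depths[f] = depth
        queue.append((current["left"], depth + 1))
        queue.append((current["right"], depth + 1))
    return depths


def inverse_minimal_depth_scores(forest, n_features):
    """Average 1 / (1 + minimal depth) over all trees.
    ASSUMPTION: this transform is illustrative, not the paper's IMD formula.
    A feature that a tree never uses contributes 0 for that tree."""
    scores = [0.0] * n_features
    for tree in forest:
        for f, d in enumerate(minimal_depths(tree, n_features)):
            if d is not None:
                scores[f] += 1.0 / (1.0 + d)
    return [s / len(forest) for s in scores]


# Two hand-built trees over four candidate predictors (indices 0..3).
forest = [
    node(2, node(0), node(1, node(0))),
    node(2, node(1), None),
]
scores = inverse_minimal_depth_scores(forest, n_features=4)
ranking = sorted(range(4), key=lambda f: scores[f], reverse=True)
print(scores)   # per-feature scores; higher means closer to the root on average
print(ranking)  # feature indices from most to least important
```

In this toy forest, predictor 2 splits the root of both trees and therefore receives the highest score, while predictor 3 is never used and scores zero. A real analysis would take the trees from a fitted random forest rather than building them by hand.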


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/b81b2a1c1d72/nihpp-2025.03.05.641533v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/423394033ae7/nihpp-2025.03.05.641533v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/a4b2097d8f9f/nihpp-2025.03.05.641533v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/d919ec6f44c4/nihpp-2025.03.05.641533v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/4f3ca7961678/nihpp-2025.03.05.641533v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/485a9dac4ae2/nihpp-2025.03.05.641533v1-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/8cdd0ed1ee13/nihpp-2025.03.05.641533v1-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/60a5de98a751/nihpp-2025.03.05.641533v1-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ef/11908250/34448fde4522/nihpp-2025.03.05.641533v1-f0009.jpg

Similar articles

1
An Integrative Multi-Omics Random Forest Framework for Robust Biomarker Discovery.
bioRxiv. 2025 Mar 6:2025.03.05.641533. doi: 10.1101/2025.03.05.641533.
2
DeepOmix: A scalable and interpretable multi-omics deep learning framework and application in cancer survival analysis.
Comput Struct Biotechnol J. 2021 May 1;19:2719-2725. doi: 10.1016/j.csbj.2021.04.067. eCollection 2021.
3
Machine learning combining multi-omics data and network algorithms identifies adrenocortical carcinoma prognostic biomarkers.
Front Mol Biosci. 2023 Nov 6;10:1258902. doi: 10.3389/fmolb.2023.1258902. eCollection 2023.
4
PathwayMultiomics: An R Package for Efficient Integrative Analysis of Multi-Omics Datasets With Matched or Un-matched Samples.
Front Genet. 2021 Dec 22;12:783713. doi: 10.3389/fgene.2021.783713. eCollection 2021.
5
Integrative Analysis of Multi-omics Data for Discovery and Functional Studies of Complex Human Diseases.
Adv Genet. 2016;93:147-90. doi: 10.1016/bs.adgen.2015.11.004. Epub 2016 Jan 25.
6
Holomics - a user-friendly R shiny application for multi-omics data integration and analysis.
BMC Bioinformatics. 2024 Mar 4;25(1):93. doi: 10.1186/s12859-024-05719-4.
7
Multi-omics data fusion using adaptive GTO guided Non-negative matrix factorization for cancer subtype discovery.
Comput Methods Programs Biomed. 2023 Jan;228:107246. doi: 10.1016/j.cmpb.2022.107246. Epub 2022 Nov 16.
8
NetMIM: network-based multi-omics integration with block missingness for biomarker selection and disease outcome prediction.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae454.
9
From Omics to Multi-Omics Approaches for In-Depth Analysis of the Molecular Mechanisms of Prostate Cancer.
Int J Mol Sci. 2022 Jun 3;23(11):6281. doi: 10.3390/ijms23116281.
10
Adaptive Sparse Multi-Block PLS Discriminant Analysis: An Integrative Method for Identifying Key Biomarkers from Multi-Omics Data.
Genes (Basel). 2023 Apr 23;14(5):961. doi: 10.3390/genes14050961.
