Wu Zejun, Min Congcong, Cao Wen, Xue Feiyang, Wu Xiaohong, Yang Yanbo, Yang Jianye, Niu Xiaohui, Gong Jing
Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430074, China.
College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, China.
Int J Mol Sci. 2025 Mar 20;26(6):2806. doi: 10.3390/ijms26062806.
Identifying cancer prognostic biomarkers is crucial for predicting disease progression, optimizing personalized therapies, and improving patient survival. Molecular biomarkers are increasingly used to estimate cancer prognosis. However, existing studies and databases often focus on a single type of molecular biomarker and lack comprehensive multi-omics data integration, which limits the exploration of biomarkers and their underlying mechanisms. To fill this gap, we conducted a systematic prognostic analysis of over 10,000 samples across 33 cancer types from The Cancer Genome Atlas (TCGA). Our study integrated nine types of molecular biomarker-related data: single-nucleotide polymorphism (SNP), copy number variation (CNV), alternative splicing (AS), alternative polyadenylation (APA), coding gene expression, DNA methylation, lncRNA expression, miRNA expression, and protein expression. Using log-rank tests, univariate Cox regression (uni-Cox), and multivariate Cox regression (multi-Cox), we evaluated potential biomarkers associated with four clinical outcome endpoints: overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). In total, we identified 4,498,523 molecular biomarkers significantly associated with cancer prognosis. Finally, we developed SurvDB, an interactive online database for data retrieval, visualization, and download, which provides a comprehensive resource for biomarker discovery and precision oncology research.
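To illustrate the log-rank step named in the abstract, the following is a minimal sketch of a two-group log-rank test in pure Python. It is not the SurvDB pipeline: the function name `logrank_test`, the two-group split (for example, samples above and below a hypothetical expression cutoff), and the toy survival times are all assumptions made for illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (hypothetical helper, not SurvDB code).

    times_*  : follow-up times for each sample
    events_* : 1 if the event (e.g. death) was observed, 0 if censored
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    # Tag each sample with its group: 0 = group A, 1 = group B.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Samples still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (invented): group A fails early, group B fails late.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                       [10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
print(f"chi2={chi2:.2f}, p={p:.4f}")
```

In practice, a maintained survival-analysis library would usually be preferred over a hand-written test, and the Cox regression steps would need covariate handling that this sketch does not attempt.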