Comparison of pathway and gene-level models for cancer prognosis prediction.

Affiliations

Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA.

Department of Medicine, Baylor College of Medicine, Institute for Clinical and Translational Research, 1 Baylor Plaza, Houston, TX, 77030, USA.

Publication information

BMC Bioinformatics. 2020 Feb 28;21(1):76. doi: 10.1186/s12859-020-3423-z.

DOI:10.1186/s12859-020-3423-z

PMID:32111152

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7048092/

Abstract

BACKGROUND

Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biological pathways. While many studies have used pathway-level predictors for cancer survival analysis, a comprehensive comparison of pathway-level and gene-level prognostic models has not been performed. To address this gap, we characterized the performance of penalized Cox proportional hazard models built using either pathway- or gene-level predictors for the cancers profiled in The Cancer Genome Atlas (TCGA) and pathways from the Molecular Signatures Database (MSigDB).
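The aggregation of gene-level data into pathway-level predictors can be illustrated with a minimal sketch. The abstract does not state which aggregation method the authors used, so the approach below (averaging per-gene z-scores over pathway members) is an assumption chosen for simplicity, and the names `zscore_genes` and `pathway_scores` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Standardize each gene across samples.

    expr maps a gene name to a list of expression values, one per sample.
    Genes with zero variance are mapped to all zeros.
    """
    z = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def pathway_scores(expr, pathways):
    """Illustrative aggregation (an assumption, not the paper's method):
    a pathway's score for a sample is the mean z-score of its member genes.

    pathways maps a pathway name to a list of gene names. Genes missing
    from expr are ignored; pathways with no measured genes are skipped.
    """
    z = zscore_genes(expr)
    scores = {}
    for name, genes in pathways.items():
        members = [g for g in genes if g in z]
        if not members:
            continue
        n_samples = len(z[members[0]])
        scores[name] = [mean(z[g][i] for g in members) for i in range(n_samples)]
    return scores
```

The resulting per-sample pathway scores would then replace the gene-level expression values as predictors in a penalized Cox model, which is why the pathway-level model starts from far fewer candidate predictors.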

RESULTS

When analyzing TCGA data, we found that pathway-level models are more parsimonious, more robust, more computationally efficient and easier to interpret than gene-level models with similar predictive performance. For example, both pathway-level and gene-level models have an average Cox concordance index of ~0.85 for the TCGA glioma cohort; however, the gene-level model has twice as many predictors on average, its predictor composition is less stable across cross-validation folds, and estimation takes 40 times as long as for the pathway-level model. When the complex correlation structure of the data is broken by permutation, the pathway-level model has greater predictive performance while still retaining superior interpretative power, robustness, parsimony and computational efficiency relative to the gene-level model. For example, using survival times simulated from uncorrelated gene expression data for the TCGA glioma cohort, the average concordance index of the pathway-level model increases to 0.88 while that of the gene-level model falls to 0.56.
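To make the concordance index figures above concrete, here is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration only: the abstract does not specify the exact estimator or tie handling the authors used, and the function name `concordance_index` is hypothetical. Values near 0.5 indicate no better than random ranking, and 1.0 indicates perfect ranking of patients by risk.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if the event (death) was observed, 0 if censored
    risk:   predicted risk score (higher means shorter expected survival)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. Ties in predicted risk
    count as half-concordant; pairs with tied times are ignored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this definition, the reported drop of the gene-level model to 0.56 on the permuted data means its risk ranking was only slightly better than chance.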

CONCLUSION

The results of this study show that when the correlations among gene expression values are low, pathway-level analyses can yield better predictive performance, greater interpretative power, more robust models and lower computational cost than a gene-level model. When correlations among genes are high, a pathway-level analysis provides predictive power equivalent to that of a gene-level analysis while retaining the advantages of interpretability, robustness and computational efficiency.
