A plea for taking all available clinical information into account when assessing the predictive value of omics data.

Affiliations

Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, 81377, Germany.

Chair of Statistics, School of Business and Economics, Humboldt-Universität zu Berlin, Spandauer Straße 1, Berlin, 10178, Germany.

Publication information

BMC Med Res Methodol. 2019 Jul 24;19(1):162. doi: 10.1186/s12874-019-0802-0.

DOI:10.1186/s12874-019-0802-0

PMID:31340753

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6657034/

Abstract

BACKGROUND

Omics data can be very informative in survival analysis and may improve the prognostic ability of classical models based on clinical risk factors for various diseases, for example breast cancer. Recent research has focused on integrating omics and clinical data, yet has often ignored the need for appropriate model building for clinical variables. Both the medical literature on classical prognostic scores and the biostatistical literature on appropriate model selection strategies for low-dimensional (clinical) data are often overlooked in omics research. The goal of this paper is to fill this methodological gap by investigating the added predictive value of gene expression data for models using varying amounts of clinical information.

METHODS

We analyze two data sets from the field of survival prognosis of breast cancer patients. First, we construct several proportional hazards prediction models using varying amounts of clinical information based on established medical knowledge. These models are then used as a starting point (i.e., included as a clinical offset) for identifying informative gene expression variables using resampling procedures and penalized regression approaches (model-based boosting and the LASSO). To assess the added predictive value of the gene signatures, measures of prediction accuracy and separation are examined on a validation data set, both for the clinical models and for the models that combine the two sources of information.
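The idea of a clinical offset can be illustrated with a minimal sketch. It assumes a precomputed clinical risk score that enters the Cox linear predictor with a fixed coefficient of 1, and it fits an L1-penalized model for the gene expression variables by plain proximal gradient descent. This is not the authors' implementation: the function names, the simulated data, the step size and the penalty value `lam` are all hypothetical choices made for illustration.

```python
import math
import random

def soft_threshold(z, t):
    """Proximal operator of t * |z| (the LASSO shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def cox_offset_gradient(times, events, X, offset, beta):
    """Gradient (scaled by 1/n) of the negative Cox partial log-likelihood.

    The linear predictor is offset[i] + X[i] . beta. The offset is the
    fixed clinical score and is never updated.
    """
    n, p = len(times), len(beta)
    weights = [math.exp(offset[i] + sum(X[i][j] * beta[j] for j in range(p)))
               for i in range(n)]
    grad = [0.0] * p
    for i in range(n):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        risk_set = [k for k in range(n) if times[k] >= times[i]]
        denom = sum(weights[k] for k in risk_set)
        for j in range(p):
            weighted_mean = sum(weights[k] * X[k][j] for k in risk_set) / denom
            grad[j] -= X[i][j] - weighted_mean
    return [g / n for g in grad]

def lasso_cox_with_offset(times, events, X, offset, lam, step=0.5, iters=200):
    """Proximal gradient descent for the L1-penalized Cox model with offset."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad = cox_offset_gradient(times, events, X, offset, beta)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: the hazard depends on the clinical score and gene 0.
rng = random.Random(0)
n, p = 40, 4
clinical_score = [rng.uniform(-1, 1) for _ in range(n)]
genes = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
event_times = [rng.expovariate(math.exp(clinical_score[i] + genes[i][0]))
               for i in range(n)]
censor_times = [rng.expovariate(0.3) for _ in range(n)]
times = [min(e, c) for e, c in zip(event_times, censor_times)]
events = [e <= c for e, c in zip(event_times, censor_times)]

beta_hat = lasso_cox_with_offset(times, events, genes, clinical_score, lam=0.05)
selected_genes = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("gene coefficients:", beta_hat)
print("selected genes:", selected_genes)
```

Because the clinical score is held fixed, a gene can only receive a nonzero coefficient if it explains survival beyond what the clinical model already captures. In practice one would use an established package and choose the penalty by resampling rather than fixing it by hand.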

RESULTS

For one data set, we do not find any substantial added predictive value of the omics data when compared to clinical models. On the second data set, we identify a noticeable added predictive value, but only in scenarios where little or no clinical information is included in the modeling process. We find that including more clinical information can lead to a smaller number of selected omics predictors.
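One way to make "added predictive value" concrete is to compare a performance measure on validation data between the clinical-only score and the combined score. The abstract does not name the measures used, so the sketch below takes Harrell's concordance index as an assumed stand-in for a separation measure. The function names and the toy numbers are hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event and
    subject j is still under observation after that event time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the validation data")
    return concordant / comparable

def added_value(times, events, clinical_score, combined_score):
    """Difference in validation concordance: combined minus clinical-only."""
    return (concordance_index(times, events, combined_score)
            - concordance_index(times, events, clinical_score))

# Hypothetical validation data: four patients, one censored.
val_times = [1.0, 2.0, 3.0, 4.0]
val_events = [True, True, False, True]
clinical_only = [0.2, 0.9, 0.1, 0.4]
clinical_plus_genes = [0.9, 0.7, 0.3, 0.1]
print(added_value(val_times, val_events, clinical_only, clinical_plus_genes))
```

A difference near zero would correspond to the situation reported for the first data set, where the gene signature adds little once the clinical model is in place. Both scores must be evaluated on data not used for fitting, otherwise the comparison favors the higher-dimensional model.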

CONCLUSIONS

New research using omics data should include all available established medical knowledge in order to allow an adequate evaluation of the added predictive value of omics data. Including all relevant clinical information in the analysis might also lead to more parsimonious models. The developed procedure to assess the predictive value of the omics data can be readily applied to other scenarios.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0e0/6657034/5b91d3cae95f/12874_2019_802_Fig1_HTML.jpg

Similar articles

Predicting censored survival data based on the interactions between meta-dimensional omics data in breast cancer.

J Biomed Inform. 2015 Aug;56:220-8. doi: 10.1016/j.jbi.2015.05.019. Epub 2015 Jun 3.

Multi-omics facilitated variable selection in Cox-regression model for cancer prognosis prediction.

Methods. 2017 Jul 15;124:100-107. doi: 10.1016/j.ymeth.2017.06.010. Epub 2017 Jun 13.

Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification

Robust estimation of the expected survival probabilities from high-dimensional Cox models with biomarker-by-treatment interactions in randomized clinical trials.

BMC Med Res Methodol. 2017 May 22;17(1):83. doi: 10.1186/s12874-017-0354-0.

High-dimensional Cox models: the choice of penalty as part of the model building process.

Biom J. 2010 Feb;52(1):50-69. doi: 10.1002/bimj.200900064.

Pan-cancer evaluation of gene expression and somatic alteration data for cancer prognosis prediction.

BMC Cancer. 2021 Sep 25;21(1):1053. doi: 10.1186/s12885-021-08796-3.

Investigating the prediction ability of survival models based on both clinical and omics data: two case studies.

Stat Med. 2014 Dec 30;33(30):5310-29. doi: 10.1002/sim.6246. Epub 2014 Jul 9.

Mixture classification model based on clinical markers for breast cancer prognosis.

Artif Intell Med. 2010 Feb-Mar;48(2-3):129-37. doi: 10.1016/j.artmed.2009.07.008. Epub 2009 Dec 14.

Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis.

BMC Med Inform Decis Mak. 2020 Sep 15;20(1):225. doi: 10.1186/s12911-020-01225-8.

Cited by

Evaluation of prognostic models to improve prediction of metastasis in patients following potentially curative treatment for primary colorectal cancer: the PROSPECT trial.

Health Technol Assess. 2025 Apr;29(8):1-91. doi: 10.3310/BTMT7049.

A Weibull mixture cure frailty model for high-dimensional covariates.

Stat Methods Med Res. 2025 Jun;34(6):1192-1218. doi: 10.1177/09622802251327687. Epub 2025 Mar 31.

The current state of imaging biomarker development and evaluation.

Br J Radiol. 2025 Jul 1;98(1171):981-986. doi: 10.1093/bjr/tqaf027.

Does combining numerous data types in multi-omics data improve or hinder performance in survival prediction? Insights from a large-scale benchmark study.

BMC Med Inform Decis Mak. 2024 Sep 2;24(1):244. doi: 10.1186/s12911-024-02642-9.

Multivariable prognostic modelling to improve prediction of colorectal cancer recurrence: the PROSPeCT trial.

Eur Radiol. 2024 Nov;34(11):6992-7001. doi: 10.1007/s00330-024-10803-7. Epub 2024 Jun 5.

Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges.

BMC Med. 2023 May 15;21(1):182. doi: 10.1186/s12916-023-02858-y.

Optimal microRNA Sequencing Depth to Predict Cancer Patient Survival with Random Forest and Cox Models.

Genes (Basel). 2022 Dec 2;13(12):2275. doi: 10.3390/genes13122275.

Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening.

BMC Cancer. 2022 Oct 5;22(1):1045. doi: 10.1186/s12885-022-10117-1.

Ten quick tips for biomarker discovery and validation analyses using machine learning.

PLoS Comput Biol. 2022 Aug 11;18(8):e1010357. doi: 10.1371/journal.pcbi.1010357. eCollection 2022 Aug.

Challenges in translational machine learning.

Hum Genet. 2022 Sep;141(9):1451-1466. doi: 10.1007/s00439-022-02439-8. Epub 2022 Mar 4.
