Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data.

Affiliation

Biometric Research Branch, US National Cancer Institute, Bethesda, MD 20892-7434, USA.

Publication information

Brief Bioinform. 2011 May;12(3):203-14. doi: 10.1093/bib/bbr001. Epub 2011 Feb 15.

DOI:10.1093/bib/bbr001
PMID:21324971
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3105299/
Abstract

Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell's concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone.
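The cross-validated survival curves described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature-selection rule in `fit_risk_model`, the median cutoff, the fold count, and the simulated pure-noise data are all assumptions chosen to keep the example self-contained. The point it illustrates is that the whole model-development step is repeated inside each training fold, each held-out patient is assigned to a risk group by a model that never saw that patient, and Kaplan-Meier curves are then computed on the pooled held-out assignments.

```python
import random

def harrell_c(times, events, scores):
    """Harrell's concordance index: share of usable pairs in which the
    subject with the higher risk score has the shorter survival time."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            # Usable pair: subject i has an observed event before time j.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable if usable else 0.5

def fit_risk_model(X, times, events, n_keep=5):
    """Hypothetical stand-in for model development: keep the n_keep features
    whose univariate concordance is farthest from 0.5, signed by direction."""
    ranked = []
    for g in range(len(X[0])):
        c = harrell_c(times, events, [row[g] for row in X])
        ranked.append((abs(c - 0.5), g, 1.0 if c >= 0.5 else -1.0))
    ranked.sort(reverse=True)
    weights = {g: sign for _, g, sign in ranked[:n_keep]}
    score = lambda row: sum(w * row[g] for g, w in weights.items())
    train_scores = sorted(score(row) for row in X)
    cutoff = train_scores[len(train_scores) // 2]  # approximate training median
    return score, cutoff

def cross_validated_groups(X, times, events, k=5, seed=0):
    """Assign each subject to high/low risk using a model fit without it."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    high_risk = [None] * len(X)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Feature selection and cutoff are redone on the training fold only.
        score, cutoff = fit_risk_model([X[i] for i in train],
                                       [times[i] for i in train],
                                       [events[i] for i in train])
        for i in test:
            high_risk[i] = score(X[i]) > cutoff
    return high_risk

def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] for one group."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Simulated data (assumption): features are pure noise, so no true signal exists.
rng = random.Random(1)
n, p = 60, 100
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
times = [rng.expovariate(1.0) for _ in range(n)]
events = [rng.random() < 0.7 for _ in range(n)]

# Re-substitution: fit and evaluate on the same data.
score, cutoff = fit_risk_model(X, times, events)
resub_c = harrell_c(times, events, [score(row) for row in X])

# Cross-validation: pooled held-out group labels, then one curve per group.
groups = cross_validated_groups(X, times, events)
km_high = kaplan_meier([t for t, h in zip(times, groups) if h],
                       [e for e, h in zip(events, groups) if h])
km_low = kaplan_meier([t for t, h in zip(times, groups) if not h],
                      [e for e, h in zip(events, groups) if not h])
print("re-substitution concordance:", round(resub_c, 3))
print("cross-validated curves:", len(km_high), "and", len(km_low), "event times")
```

Because the features here carry no information about survival, any concordance well above 0.5 in the re-substitution line reflects the optimism the abstract warns about, whereas the cross-validated curves for the two groups would be expected to overlap apart from sampling noise.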



