Personalized Risk Prediction in Clinical Oncology Research: Applications and Practical Issues Using Survival Trees and Random Forests.

Author information

Hu Chen, Steingrimsson Jon Arni

Affiliations

a Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

b Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA.

Publication information

J Biopharm Stat. 2018;28(2):333-349. doi: 10.1080/10543406.2017.1377730. Epub 2017 Oct 19.

Abstract

A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple interpretable tree structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.
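As an illustration of what a splitting rule for right-censored outcomes can look like, the sketch below computes the two-sample log-rank statistic for a candidate split, one widely used choice when growing survival trees. The abstract does not say which splitting rules the article covers, so the choice of the log-rank rule, the function names `logrank_statistic` and `best_split`, and the toy data are assumptions made for illustration only.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for one candidate split.

    times:   observed follow-up times (event or censoring time)
    events:  1 if the event was observed, 0 if the time is right-censored
    in_left: True if the subject would fall in the left child node
    """
    # Only times at which an event was observed contribute to the statistic.
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before t (censored subjects count
        # until their censoring time, which is how censoring is handled).
        n = sum(1 for u in times if u >= t)
        n_left = sum(1 for u, g in zip(times, in_left) if u >= t and g)
        # Events occurring exactly at t, overall and in the left child.
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d_left = sum(
            1 for u, e, g in zip(times, events, in_left) if u == t and e and g
        )
        observed_minus_expected += d_left - d * n_left / n
        if n > 1:
            p = n_left / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance


def best_split(times, events, x):
    """Scan thresholds of a single covariate x; return (threshold, statistic)."""
    best = (None, 0.0)
    for c in sorted(set(x))[:-1]:  # the largest value would leave one child empty
        stat = logrank_statistic(times, events, [v <= c for v in x])
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data (hypothetical): eight patients, two of them censored.
times = [2, 3, 5, 6, 8, 9, 12, 15]
events = [1, 1, 1, 0, 1, 0, 1, 1]
marker = [0.9, 0.8, 0.7, 0.75, 0.3, 0.4, 0.2, 0.1]
threshold, statistic = best_split(times, events, marker)
```

A tree-growing procedure would apply such a search to every covariate at every node and split on the best candidate. A forest would then repeat this on bootstrap samples and average the resulting trees' predictions.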

