
Accelerated and Interpretable Oblique Random Survival Forests.

Author information

Jaeger Byron C, Welden Sawyer, Lenoir Kristin, Speiser Jaime L, Segar Matthew W, Pandey Ambarish, Pajewski Nicholas M

Affiliations

Department of Biostatistics and Data Science, Wake Forest University School of Medicine, Winston-Salem, NC.

Department of Cardiology, Texas Heart Institute, Houston, TX.

Publication information

J Comput Graph Stat. 2024;33(1):192-207. doi: 10.1080/10618600.2023.2231048.

DOI:10.1080/10618600.2023.2231048
PMID:39184344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11343578/
Abstract

The oblique random survival forest (RSF) is an ensemble supervised learning method for right-censored outcomes. Trees in the oblique RSF are grown using linear combinations of predictors, whereas in the standard RSF, a single predictor is used. Oblique RSF ensembles have high prediction accuracy, but assessing many linear combinations of predictors induces high computational overhead. In addition, few methods have been developed for estimation of variable importance (VI) with oblique RSFs. We introduce a method to increase computational efficiency of the oblique RSF and a method to estimate VI with the oblique RSF. Our computational approach uses Newton-Raphson scoring in each non-leaf node. We estimate VI by negating each coefficient used for a given predictor in linear combinations, and then computing the reduction in out-of-bag accuracy. In benchmarking experiments, we find our implementation of the oblique RSF is hundreds of times faster, with equivalent prediction accuracy, compared to existing software for oblique RSFs. We find in simulation studies that "negation VI" discriminates between relevant and irrelevant numeric predictors more accurately than permutation VI, Shapley VI, and a technique to measure VI using analysis of variance. All oblique RSF methods in the current study are available in the aorsf R package, and additional supplemental materials are available online.
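The negation idea in the abstract can be illustrated with a minimal Python sketch. It is not the aorsf implementation: it assumes a single linear combination of predictors instead of a forest, scores accuracy with Harrell's concordance index on the full simulated sample instead of out-of-bag data, and uses the simulation's true coefficients as a stand-in for fitted ones. The function names `concordance_index` and `negation_importance` are hypothetical.

```python
import numpy as np


def concordance_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the higher risk fails first."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable


def negation_importance(coefs, X, time, event):
    """Drop in concordance after flipping the sign of each predictor's coefficient."""
    baseline = concordance_index(time, event, X @ coefs)
    importance = np.empty(len(coefs))
    for k in range(len(coefs)):
        flipped = coefs.copy()
        flipped[k] = -flipped[k]  # negate only the coefficient of predictor k
        importance[k] = baseline - concordance_index(time, event, X @ flipped)
    return baseline, importance


# Simulated right-censored data (assumption: exponential event and censoring times).
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 3))
true_beta = np.array([1.0, 0.5, 0.0])  # third predictor is irrelevant
time_to_event = rng.exponential(scale=np.exp(-X @ true_beta))
censor_time = rng.exponential(scale=2.0, size=n)
time = np.minimum(time_to_event, censor_time)
event = time_to_event <= censor_time

coefs = true_beta.copy()  # assumption: stand-in for coefficients fitted in a node
baseline, vi = negation_importance(coefs, X, time, event)
print("baseline C:", round(baseline, 3))
print("negation VI:", np.round(vi, 3))
```

Negating a coefficient of zero leaves every prediction unchanged, so an unused predictor gets an importance of exactly zero in this sketch, while a predictor with a large coefficient reverses much of the risk ordering when its sign is flipped.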



