Random Forest.

Author information

Rigatti Steven J

Publication information

J Insur Med. 2017;47(1):31-39. doi: 10.17849/insm-47-01-31-39.1.

Abstract

For the task of analyzing survival data to derive risk factors associated with mortality, physicians, researchers, and biostatisticians have typically relied on certain types of regression techniques, most notably the Cox model. With the advent of more widely distributed computing power, methods which require more complex mathematics have become increasingly common. Particularly in this era of "big data" and machine learning, survival analysis has become methodologically broader. This paper aims to explore one technique known as Random Forest. The Random Forest technique is a regression tree technique which uses bootstrap aggregation and randomization of predictors to achieve a high degree of predictive accuracy. The various input parameters of the random forest are explored. Colon cancer data (n = 66,807) from the SEER database is then used to construct both a Cox model and a random forest model to determine how well the models perform on the same data. Both models perform well, achieving a concordance error rate of approximately 18%.
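
The abstract reports a "concordance error rate" without defining it. A minimal sketch of one common reading follows, assuming the figure means one minus Harrell's concordance index (C-index) for right-censored survival data. The function name `concordance_error`, the toy inputs, and the choice to skip pairs with tied times are illustrative assumptions, not the paper's code or data.

```python
from itertools import combinations


def concordance_error(times, events, risk_scores):
    """Return 1 - Harrell's C for right-censored data (illustrative sketch).

    times       -- observed follow-up time per subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed
        # event. Pairs with tied times are skipped (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return 1.0 - concordant / comparable


# Toy data (made up for illustration): 4 of 5 comparable pairs are ordered
# correctly, so the error is about 0.2.
error = concordance_error(
    times=[5, 8, 12, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.4, 0.5, 0.1],
)
print(round(error, 3))
```

Under this reading, an error rate of approximately 18% corresponds to a C-index of about 0.82: the model ranks roughly 82% of comparable patient pairs in the correct order of survival.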
