Mortality risk score prediction in an elderly population using machine learning.

Affiliation

Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA.

Publication information

Am J Epidemiol. 2013 Mar 1;177(5):443-52. doi: 10.1093/aje/kws241. Epub 2013 Jan 29.

Abstract

Standard practice for prediction often relies on parametric regression methods. Interesting new methods from the machine learning literature have been introduced in epidemiologic studies, such as random forest and neural networks. However, a priori, an investigator will not know which algorithm to select and may wish to try several. Here I apply the super learner, an ensembling machine learning approach that combines multiple algorithms into a single algorithm and returns a prediction function with the best cross-validated mean squared error. Super learning is a generalization of stacking methods. I used super learning in the Study of Physical Performance and Age-Related Changes in Sonomans (SPPARCS) to predict death among 2,066 residents of Sonoma, California, aged 54 years or more during the period 1993-1999. The super learner for predicting death (risk score) improved upon all single algorithms in the collection of algorithms, although its performance was similar to that of several algorithms. Super learner outperformed the worst algorithm (neural networks) by 44% with respect to estimated cross-validated mean squared error and had an R² value of 0.201. The improvement of super learner over random forest with respect to R² was approximately 2-fold. Alternatives for risk score prediction include the super learner, which can provide improved performance.
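
The idea of combining several algorithms by cross-validated mean squared error can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the study's implementation. The data are simulated rather than taken from SPPARCS. The three candidate learners (`fit_mean`, `fit_linear`, `fit_knn`) are hypothetical stand-ins for the paper's collection of algorithms. The convex weights are found by a coarse grid search rather than the constrained optimization a full implementation would typically use.

```python
import math
import random

# Candidate learners: illustrative stand-ins, not the paper's library.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=15):
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

LEARNERS = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}

def mse(pred, ys):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

def cv_predictions(xs, ys, n_folds=5):
    """Out-of-fold predictions: each point is predicted by models that never saw it."""
    n = len(xs)
    preds = {name: [0.0] * n for name in LEARNERS}
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for name, fit in LEARNERS.items():
            model = fit(tx, ty)
            for i in held_out:
                preds[name][i] = model(xs[i])
    return preds

def simplex_grid(k, total):
    """Yield all tuples of k non-negative integers that sum to `total`."""
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in simplex_grid(k - 1, total - first):
            yield (first,) + rest

def super_learner_weights(preds, ys, steps=20):
    """Grid-search the convex combination with the lowest cross-validated MSE."""
    names = list(preds)
    best_err, best_w = None, None
    for combo in simplex_grid(len(names), steps):
        w = [c / steps for c in combo]
        blend = [sum(wj * preds[nm][i] for wj, nm in zip(w, names))
                 for i in range(len(ys))]
        err = mse(blend, ys)
        if best_err is None or err < best_err:
            best_err, best_w = err, dict(zip(names, w))
    return best_err, best_w

# Simulated data (an assumption): death risk rises with age.
random.seed(0)
ages = [random.uniform(54, 95) for _ in range(300)]
died = [1.0 if random.random() < 1 / (1 + math.exp(-(a - 80) / 6)) else 0.0
        for a in ages]

cv_preds = cv_predictions(ages, died)
single_mse = {name: mse(p, died) for name, p in cv_preds.items()}
ensemble_mse, weights = super_learner_weights(cv_preds, died)

print("single-learner CV MSE:", single_mse)
print("ensemble CV MSE:", ensemble_mse, "weights:", weights)
```

Because the grid includes the corner points where one learner receives all the weight, the selected combination can never have a higher cross-validated MSE than the best single candidate on these predictions. The weights here are chosen and scored on the same out-of-fold predictions, so the ensemble figure is optimistic. An honest estimate of the ensemble's own performance would need an additional outer layer of cross-validation.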
