The evolution of boosting algorithms. From machine learning to statistical modelling.

Authors

Mayr A, Binder H, Gefeller O, Schmid M

Affiliation

Andreas Mayr, Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Waldstr. 6, 91054 Erlangen, Germany

Publication details

Methods Inf Med. 2014;53(6):419-27. doi: 10.3414/ME13-01-0122. Epub 2014 Aug 12.

Abstract

BACKGROUND

The concept of boosting emerged from the field of machine learning. The basic idea is to boost the accuracy of a weak classifying tool by combining various instances into a more accurate prediction. This general concept was later adapted to the field of statistical modelling. Nowadays, boosting algorithms are often applied to estimate and select predictor effects in statistical regression models.
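
As an illustration of the basic idea described above, the following is a minimal sketch of discrete AdaBoost with decision stumps as the weak classifier. It is not taken from the article: the function names (`fit_stump`, `adaboost`, `predict`), the choice of stumps, and the toy data are assumptions made for this example, and the labels are assumed to be coded as -1 and +1.

```python
import math


def fit_stump(X, y, w):
    """Search all one-feature threshold rules and return the one with the
    smallest weighted error as (feature, threshold, sign, error)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[3]:
                    best = (j, thr, sign, err)
    return best


def stump_predict(stump, row):
    j, thr, sign, _ = stump
    return sign if row[j] <= thr else -sign


def adaboost(X, y, n_rounds=10):
    """Fit a weighted ensemble of stumps; y must contain -1 / +1 labels."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform observation weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[3], 1e-12), 1 - 1e-12)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weights of misclassified observations, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Classify by the sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Toy data (hypothetical): no single threshold separates the classes.
    X = [[0], [1], [2], [3], [4], [5]]
    y = [1, 1, -1, -1, 1, 1]
    model = adaboost(X, y, n_rounds=3)
    print([predict(model, row) for row in X])
```

On this toy data a single stump necessarily misclassifies some points, whereas the weighted vote of three stumps reproduces all six labels. This is the sense in which the combination "boosts" a weak classifier.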

OBJECTIVES

This review article attempts to highlight the evolution of boosting algorithms from machine learning to statistical modelling.

METHODS

We describe the AdaBoost algorithm for classification as well as the two most prominent statistical boosting approaches, gradient boosting and likelihood-based boosting for statistical modelling. We highlight the methodological background and present the most common software implementations.

RESULTS

Although gradient boosting and likelihood-based boosting are typically treated separately in the literature, they share the same methodological roots and follow the same fundamental concepts. Compared to the initial machine learning algorithms, which must be seen as black-box prediction schemes, they result in statistical models with a straightforward interpretation.
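
To make the point about interpretability concrete, here is a minimal sketch of component-wise gradient boosting for squared-error loss with simple linear base-learners. It rests on assumptions that do not come from the article: the function name `componentwise_l2_boost`, the step length `nu=0.1`, the fixed number of iterations, and the toy data are all chosen for illustration, and the code is not meant to reproduce any particular software package.

```python
def componentwise_l2_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per covariate.

    Returns (intercept, coefficients) on the original covariate scale.
    """
    n, p = len(X), len(X[0])
    # Centre the covariates so each base-learner needs only a slope.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    offset = sum(y) / n  # initial fit: the mean of the response
    coef = [0.0] * p
    fitted = [offset] * n
    for _ in range(n_iter):
        # For the loss (y - f)^2 / 2 the negative gradient is the residual.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        best_j, best_beta, best_rss = None, 0.0, float("inf")
        for j in range(p):
            sxx = sum(row[j] ** 2 for row in Xc)
            if sxx == 0.0:
                continue  # constant covariate, nothing to fit
            beta = sum(row[j] * r for row, r in zip(Xc, resid)) / sxx
            rss = sum((r - beta * row[j]) ** 2 for row, r in zip(Xc, resid))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        if best_j is None:
            break
        # Update only the best-fitting component, shrunk by the step length.
        coef[best_j] += nu * best_beta
        fitted = [f + nu * best_beta * row[best_j]
                  for f, row in zip(fitted, Xc)]
    intercept = offset - sum(c * m for c, m in zip(coef, means))
    return intercept, coef


if __name__ == "__main__":
    # Toy data (hypothetical): the response depends on the first covariate only.
    X = [[1, 5, 2], [2, 3, 7], [3, 8, 1], [4, 1, 6], [5, 9, 3], [6, 2, 8]]
    y = [2 * row[0] + 1 for row in X]
    intercept, coef = componentwise_l2_boost(X, y)
    print(intercept, coef)
```

Because each iteration updates the coefficient of a single covariate, the result is an ordinary linear predictor whose coefficients can be read directly. Covariates that are never selected keep a coefficient of exactly zero, which is how this kind of algorithm estimates and selects predictor effects at the same time.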

CONCLUSIONS

Statistical boosting algorithms have gained substantial interest during the last decade and offer a variety of options to address important research questions in modern biomedicine.
