Stacked generalization: an introduction to super learning.

Author affiliations

Department of Epidemiology, University of Pittsburgh, 130 DeSoto Street 503 Parran Hall, Pittsburgh, PA, 15261, USA.

Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, USA.

Publication information

Eur J Epidemiol. 2018 May;33(5):459-464. doi: 10.1007/s10654-018-0390-z. Epub 2018 Apr 10.

Abstract

Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into a host of methods among which is the "Super Learner". Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
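
To make the mechanics described in the abstract concrete, the following is a minimal sketch of the idea, not the authors' worked examples or any package's implementation. It uses only the Python standard library. The three-learner library (mean, simple linear regression, k-nearest neighbors), the simulated data, the interleaved fold assignment, and the brute-force grid search over convex weights are all assumptions chosen for transparency. The objective function here is mean squared error, one of the options the abstract names.

```python
import math
import random

# --- Hypothetical candidate library: each function fits on (xs, ys) and returns a predictor ---
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=5):
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

LIBRARY = [fit_mean, fit_linear, fit_knn]

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def cross_validated_predictions(xs, ys, n_folds):
    """Return Z, where Z[i][j] is learner j's prediction for row i,
    made by a model that never saw row i during fitting."""
    n = len(xs)
    Z = [[0.0] * len(LIBRARY) for _ in range(n)]
    for v in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == v]
        train = [i for i in range(n) if i % n_folds != v]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for j, fit in enumerate(LIBRARY):
            model = fit(tx, ty)
            for i in held_out:
                Z[i][j] = model(xs[i])
    return Z

def simplex_grid(n_learners, steps):
    """Yield integer count vectors summing to `steps`; dividing by the total
    number of steps gives non-negative weights that sum to 1."""
    if n_learners == 1:
        yield [steps]
        return
    for first in range(steps + 1):
        for rest in simplex_grid(n_learners - 1, steps - first):
            yield [first] + rest

def fit_super_learner(xs, ys, n_folds=5, steps=20):
    Z = cross_validated_predictions(xs, ys, n_folds)
    best_w, best_risk = None, float("inf")
    # Pick the convex combination of learners with the lowest cross-validated MSE.
    for counts in simplex_grid(len(LIBRARY), steps):
        w = [c / steps for c in counts]
        blended = [sum(wj * zj for wj, zj in zip(w, row)) for row in Z]
        risk = mse(blended, ys)
        if risk < best_risk:
            best_w, best_risk = w, risk
    # Refit every learner on the full data and combine with the chosen weights.
    models = [fit(xs, ys) for fit in LIBRARY]
    predict = lambda x: sum(wj * m(x) for wj, m in zip(best_w, models))
    candidate_risks = [mse([row[j] for row in Z], ys) for j in range(len(LIBRARY))]
    return predict, best_w, best_risk, candidate_risks

# Simulated data (an assumption for illustration only).
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [math.sin(x) + 0.5 * x + random.gauss(0, 0.3) for x in xs]

predict, weights, cv_risk, candidate_risks = fit_super_learner(xs, ys)
print("weights:", weights)
print("ensemble CV MSE:", cv_risk)
print("candidate CV MSEs:", candidate_risks)
```

Because the weight grid includes the corners that put all weight on a single learner, the ensemble's cross-validated risk in this sketch can never exceed that of the best individual candidate. In practice the weights are usually found with a constrained optimizer rather than a grid, which scales poorly as the library grows.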
