

Stacked generalization: an introduction to super learning.

Affiliations

Department of Epidemiology, University of Pittsburgh, 130 DeSoto Street 503 Parran Hall, Pittsburgh, PA, 15261, USA.

Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, USA.

Publication information

Eur J Epidemiol. 2018 May;33(5):459-464. doi: 10.1007/s10654-018-0390-z. Epub 2018 Apr 10.

DOI: 10.1007/s10654-018-0390-z
PMID: 29637384
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6089257/
Abstract

Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into a host of methods among which is the "Super Learner". Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
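
The procedure the abstract describes can be illustrated with a minimal sketch. It is not the authors' code or the API of any Super Learner package. It assumes a hypothetical two-algorithm library (a mean-only predictor and a simple least-squares line), simulated data, mean squared error as the objective function, and a coarse grid search over convex weights in place of a proper constrained optimizer. All function and variable names are illustrative.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Candidate 1 (assumed): ignore x and predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate 2 (assumed): simple least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def mse(preds, ys):
    """Objective function: mean squared error."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))

def cv_predictions(fit, xs, ys, folds):
    """Predict every observation from a model trained without its fold."""
    n = len(xs)
    out = [0.0] * n
    for v in set(folds):
        train = [i for i in range(n) if folds[i] != v]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if folds[i] == v:
                out[i] = model(xs[i])
    return out

def super_learner(xs, ys, V=5, grid=101, seed=0):
    """Return (predict, weight on the linear candidate, cross-validated risk)."""
    rng = random.Random(seed)
    folds = [i % V for i in range(len(xs))]  # balanced fold labels
    rng.shuffle(folds)                       # random assignment to folds
    z_mean = cv_predictions(fit_mean, xs, ys, folds)
    z_lin = cv_predictions(fit_linear, xs, ys, folds)
    # Grid search over convex combinations (weights sum to 1, both >= 0).
    best_w, best_risk = 0.0, float("inf")
    for k in range(grid):
        w = k / (grid - 1)
        blend = [(1 - w) * a + w * b for a, b in zip(z_mean, z_lin)]
        risk = mse(blend, ys)
        if risk < best_risk:
            best_w, best_risk = w, risk
    # Refit each candidate on the full data, then combine with the chosen weight.
    m_mean, m_lin = fit_mean(xs, ys), fit_linear(xs, ys)

    def predict(x):
        return (1 - best_w) * m_mean(x) + best_w * m_lin(x)

    return predict, best_w, best_risk

# Simulated data (an assumption for illustration): y = 1 + 2x + noise.
_rng = random.Random(42)
xs = [_rng.uniform(0, 10) for _ in range(200)]
ys = [1 + 2 * x + _rng.gauss(0, 0.5) for x in xs]

predict, weight, cv_risk = super_learner(xs, ys)
print("weight on linear candidate:", weight)
print("cross-validated MSE:", cv_risk)
```

Because the weights are chosen on cross-validated predictions rather than on in-sample fits, a candidate that overfits is not rewarded for doing so. The weight grid includes 0 and 1, so in this sketch the combined cross-validated risk cannot exceed that of either candidate used alone.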


