Cross-Validated Bagged Learning.

Authors

Petersen Maya L, Molinaro Annette M, Sinisi Sandra E, van der Laan Mark J

Affiliations

Division of Biostatistics, University of California, Berkeley, School of Public Health, Earl Warren Hall 7360 Berkeley, California 94720-7360, phone: 510.642.3241 fax: 510.643.5163.

Publication information

J Multivar Anal. 2008 Mar;25(2):260-266. doi: 10.1016/j.jmva.2007.07.004.

Abstract

Many applications aim to learn a high-dimensional parameter of a data-generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, bootstrap aggregating (bagging) has been introduced as a method to reduce the variance of a given estimator at little cost to bias. Bagging involves applying an estimator to multiple bootstrap samples and averaging the results across bootstrap samples. In order to address the curse of dimensionality, a common practice has been to apply bagging to estimators which themselves use cross-validation, thereby using cross-validation within a bootstrap sample to select fine-tuning parameters trading off bias and variance of the bootstrap sample-specific candidate estimators. In this article we point out that in order to achieve the correct bias-variance trade-off for the parameter of interest, one should apply the cross-validation selector externally to candidate bagged estimators indexed by these fine-tuning parameters. We use three simulations to compare the new cross-validated bagging method with bagging of cross-validated estimators and bagging of non-cross-validated estimators.
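The externally cross-validated scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation or their simulations: it assumes a one-dimensional k-nearest-neighbor regressor as the base estimator, with k as the fine-tuning parameter, squared-error loss, and V-fold cross-validation. All function names and the simulated data are hypothetical.

```python
import math
import random

def knn_predict(train, x, k):
    """Average outcome of the k training points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def bagged_predict(train, x, k, n_boot, rng):
    """Average the k-NN prediction over n_boot bootstrap resamples of train."""
    total = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(train) for _ in train]  # sample with replacement
        total += knn_predict(boot, x, k)
    return total / n_boot

def cv_risk_of_bagged(data, k, n_folds, n_boot, rng):
    """V-fold cross-validated squared-error risk of the bagged estimator
    for one fixed value of the tuning parameter k."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    sq_err, n = 0.0, 0
    for v in range(n_folds):
        # Training set: every fold except the held-out fold v.
        train = [p for j, fold in enumerate(folds) if j != v for p in fold]
        for x, y in folds[v]:
            sq_err += (y - bagged_predict(train, x, k, n_boot, rng)) ** 2
            n += 1
    return sq_err / n

def select_k_externally(data, candidate_ks, n_folds=5, n_boot=20, seed=0):
    """Cross-validation applied outside the bagging step: each candidate is
    a whole bagged estimator indexed by k, and the selector picks the k
    whose bagged estimator has the smallest cross-validated risk."""
    rng = random.Random(seed)
    risks = {k: cv_risk_of_bagged(data, k, n_folds, n_boot, rng)
             for k in candidate_ks}
    return min(risks, key=risks.get), risks

# Simulated data (an assumption for illustration): y = sin(2*pi*x) + noise.
data_rng = random.Random(1)
data = []
for _ in range(60):
    x = data_rng.random()
    data.append((x, math.sin(2 * math.pi * x) + data_rng.gauss(0, 0.3)))

best_k, risks = select_k_externally(data, candidate_ks=[1, 5, 15])
print("selected k:", best_k)
```

The contrast with bagging of cross-validated estimators is where the selector sits: in that approach k would be chosen separately inside each bootstrap sample before averaging, whereas here k is held fixed across all bootstrap samples and chosen once, based on the risk of the averaged predictor.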
