Nonparametric bootstrap inference for the targeted highly adaptive least absolute shrinkage and selection operator (LASSO) estimator.

Authors

Cai Weixin, van der Laan Mark

Affiliation

Division of Biostatistics, University of California, Berkeley, USA.

Publication information

Int J Biostat. 2020 Aug 10. doi: 10.1515/ijb-2017-0070.

DOI:10.1515/ijb-2017-0070

PMID:32772002

Abstract

The Highly-Adaptive least absolute shrinkage and selection operator (LASSO) Targeted Minimum Loss Estimator (HAL-TMLE) is an efficient plug-in estimator of a pathwise differentiable parameter in a statistical model that at minimum (and possibly only) assumes that the sectional variation norm of the true nuisance functions (i.e., the relevant part of the data distribution) is finite. It relies on an initial estimator (HAL-MLE) of the nuisance functions, obtained by minimizing the empirical risk over the parameter space under the constraint that the sectional variation norm of the candidate functions is bounded by a constant, where this constant can be selected with cross-validation. In this article we establish that the nonparametric bootstrap for the HAL-TMLE, fixing the value of the sectional variation norm at a value larger than or equal to the cross-validation selector, provides a consistent method for estimating the normal limit distribution of the HAL-TMLE. To optimize the finite sample coverage of the nonparametric bootstrap confidence intervals, we propose a selection method for this sectional variation norm that is based on running the nonparametric bootstrap for all values of the sectional variation norm larger than the one selected by cross-validation, and subsequently determining a value at which the width of the resulting confidence intervals reaches a plateau. We demonstrate our method for 1) nonparametric estimation of the average treatment effect when observing a covariate vector, binary treatment, and outcome, and for 2) nonparametric estimation of the integral of the square of the multivariate density of the data distribution. We also present simulation results for these two examples demonstrating the excellent finite sample coverage of bootstrap-based confidence intervals.
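The plateau-based selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the HAL-TMLE is replaced by a toy stand-in (a mean of values clipped to a bound), and the function names, the grid of bounds, the value `cv_bound`, and the relative-tolerance plateau rule are all assumptions made here for illustration. The sketch only shows the general pattern: run the nonparametric bootstrap for each bound at or above the cross-validated one, record the confidence interval width, and pick the first bound at which the width stops changing.

```python
import random


def bootstrap_ci_width(data, estimator, n_boot=500, alpha=0.05, seed=0):
    """Width of the percentile bootstrap interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n observations with replacement and re-estimate, n_boot times.
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * (n_boot - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_boot - 1))]
    return upper - lower


def select_plateau(bounds, widths, rel_tol=0.01):
    """First bound whose width changes by at most rel_tol (relative) at the
    next grid point. This rule is an assumption, not the paper's criterion."""
    for i in range(len(bounds) - 1):
        if abs(widths[i + 1] - widths[i]) <= rel_tol * widths[i]:
            return bounds[i]
    return bounds[-1]  # no plateau detected on this grid


def clipped_mean(bound):
    """Toy estimator constrained by `bound`; a stand-in, NOT the HAL-TMLE."""
    def estimator(sample):
        return sum(max(-bound, min(bound, x)) for x in sample) / len(sample)
    return estimator


# Simulated data and a hypothetical cross-validated bound.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
cv_bound = 0.5
bounds = [cv_bound * k for k in range(1, 13)]  # grid from 0.5 to 6.0

widths = [bootstrap_ci_width(data, clipped_mean(b)) for b in bounds]
chosen = select_plateau(bounds, widths)
print("chosen bound:", chosen)
```

In this toy setting the constraint stops binding once the bound exceeds the largest absolute observation, so the interval width levels off, which mimics the plateau the abstract refers to. A real application would replace `clipped_mean` with a HAL-TMLE fit at a fixed sectional variation norm.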

Similar articles

Nonparametric bootstrap inference for the targeted highly adaptive least absolute shrinkage and selection operator (LASSO) estimator.

Int J Biostat. 2020 Aug 10. doi: 10.1515/ijb-2017-0070.

A Generally Efficient Targeted Minimum Loss Based Estimator based on the Highly Adaptive Lasso.

Int J Biostat. 2017 Oct 12;13(2):/j/ijb.2017.13.issue-2/ijb-2015-0097/ijb-2015-0097.xml. doi: 10.1515/ijb-2015-0097.

Collaborative double robust targeted maximum likelihood estimation.

Int J Biostat. 2010 May 17;6(1):Article 17. doi: 10.2202/1557-4679.1181.

Efficient estimation of pathwise differentiable target parameters with the undersmoothed highly adaptive lasso.

Int J Biostat. 2022 Jul 15;19(1):261-289. doi: 10.1515/ijb-2019-0092. eCollection 2023 May 1.

Targeted Learning of the Mean Outcome under an Optimal Dynamic Treatment Rule.

J Causal Inference. 2015 Mar;3(1):61-95. doi: 10.1515/jci-2013-0022.

Targeted maximum likelihood estimation for prediction calibration.

Int J Biostat. 2012 Oct 31;8(1):30. doi: 10.1515/1557-4679.1385.

The Highly Adaptive Lasso Estimator.

Proc Int Conf Data Sci Adv Anal. 2016;2016:689-696. doi: 10.1109/DSAA.2016.93. Epub 2016 Dec 26.

One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels.

Int J Biostat. 2016 May 1;12(1):351-78. doi: 10.1515/ijb-2015-0054.

Optimal Nonparametric Inference with Two-Scale Distributional Nearest Neighbors.

J Am Stat Assoc. 2024;119(545):297-307. doi: 10.1080/01621459.2022.2115375. Epub 2022 Oct 5.

Doubly robust inference for targeted minimum loss-based estimation in randomized trials with missing outcome data.

Stat Med. 2017 Oct 30;36(24):3807-3819. doi: 10.1002/sim.7389. Epub 2017 Jul 25.

Cited by

Identification of a hypoxia-related gene signature associated with childhood asthma.

Genes Genomics. 2025 Aug 18. doi: 10.1007/s13258-025-01665-4.

Identification and validation of anoikis-related differentially expressed genes in nasopharyngeal carcinoma.

Transl Cancer Res. 2025 Jul 30;14(7):4429-4446. doi: 10.21037/tcr-2025-1263. Epub 2025 Jul 27.

Construction of a risk and prognostic model for migrasome-associated lncRNAs in renal cell carcinoma.

Sci Rep. 2025 Jul 23;15(1):26760. doi: 10.1038/s41598-025-10630-w.

Performance of Cross-Validated Targeted Maximum Likelihood Estimation.

Stat Med. 2025 Jul;44(15-17):e70185. doi: 10.1002/sim.70185.

Cellular senescence defining the disease characteristics of Crohn's disease.

Front Immunol. 2025 Jun 30;16:1616531. doi: 10.3389/fimmu.2025.1616531. eCollection 2025.

Cross-talk between mitophagy pathways in pre-eclampsia and gestational diabetes mellitus: a systematic analysis of shared molecular mechanisms.

Eur J Med Res. 2025 Jul 3;30(1):568. doi: 10.1186/s40001-025-02823-w.

Oxidative stress gene expression in ulcerative colitis: implications for colon cancer biomarker discovery.

Sci Rep. 2025 Jul 2;15(1):22641. doi: 10.1038/s41598-025-05108-8.

Prognostic role of tumor microenvironment and immune- and autophagy-related genes in colorectal adenocarcinoma.

Transl Cancer Res. 2025 May 30;14(5):2835-2857. doi: 10.21037/tcr-24-1708. Epub 2025 May 27.

Commentary on "Nonparametric identification is not enough, but randomized controlled trials are": Statistical considerations for generating reliable evidence across a spectrum of studies that increasingly involve real-world elements.

Obs Stud. 2025 Apr 11;11(1):61-76. doi: 10.1353/obs.2025.a956842. eCollection 2025.

Development of machine learning models to predict the risk of fungal infection following flexible ureteroscopy lithotripsy.

BMC Med Inform Decis Mak. 2025 Apr 10;25(1):159. doi: 10.1186/s12911-025-02987-9.
