Targeted estimation of nuisance parameters to obtain valid statistical inference.

Author information

van der Laan Mark J

Publication information

Int J Biostat. 2014;10(1):29-57. doi: 10.1515/ijb-2012-0038.

Abstract

In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. 
As a particular special case, we also demonstrate the required targeting of the propensity score for the inverse probability of treatment weighted estimator using super-learning to fit the propensity score.
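
To make the two ingredients named in the abstract concrete (the propensity score and the conditional mean of the outcome), the following is a minimal sketch of an IPTW estimator and a standard TMLE of the treatment specific mean E[E(Y | A=1, W)] on simulated data. Everything here is an assumption for illustration: the data-generating process, the function names (`simulate`, `estimate_treatment_specific_mean`), the truncation bounds on the propensity score, and the use of plain logistic regressions in place of super-learning. It shows only the usual targeting step for the outcome regression; it does not implement the additional targeting of the nuisance parameter estimators or the C-TMLE variable selection that the paper proposes.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def logit(p):
    p = np.clip(p, 1e-9, 1 - 1e-9)  # guard against log(0)
    return np.log(p / (1 - p))


def fit_logistic(X, y, offset=None, n_iter=50, tol=1e-10):
    """Logistic regression coefficients by Newton-Raphson, with an optional fixed offset."""
    X = np.asarray(X, dtype=float)
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-10 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def simulate(n, rng):
    """Hypothetical data: one baseline covariate W, binary treatment A, binary outcome Y."""
    W = rng.normal(size=n)
    A = rng.binomial(1, expit(0.5 * W))       # true propensity score
    Y = rng.binomial(1, expit(-0.5 + A + W))  # true outcome regression
    return W, A, Y


def estimate_treatment_specific_mean(W, A, Y):
    n = len(Y)
    ones = np.ones(n)

    # Propensity score g(W) = P(A=1 | W), truncated (bounds are an assumption).
    X_g = np.column_stack([ones, W])
    g_hat = np.clip(expit(X_g @ fit_logistic(X_g, A)), 0.025, 0.975)

    # IPTW estimator: weight treated outcomes by 1 / g(W).
    iptw = np.mean(A * Y / g_hat)

    # Initial outcome regression Q(A, W) = E(Y | A, W).
    X_Q = np.column_stack([ones, A, W])
    beta_Q = fit_logistic(X_Q, Y)
    Q_A = expit(X_Q @ beta_Q)
    Q_1 = expit(np.column_stack([ones, ones, W]) @ beta_Q)

    # Targeting step: one-dimensional logistic fluctuation along H = A / g(W),
    # with the initial fit as a fixed offset.
    H = A / g_hat
    eps = fit_logistic(H[:, None], Y, offset=logit(Q_A))[0]
    Q_A_star = expit(logit(Q_A) + eps * H)
    Q_1_star = expit(logit(Q_1) + eps / g_hat)

    # Plug-in TMLE and influence-curve-based standard error.
    tmle = np.mean(Q_1_star)
    ic = H * (Y - Q_A_star) + Q_1_star - tmle
    se = np.std(ic, ddof=1) / np.sqrt(n)
    return {"iptw": iptw, "tmle": tmle, "se": se, "ic": ic}


if __name__ == "__main__":
    W, A, Y = simulate(20000, np.random.default_rng(0))
    res = estimate_treatment_specific_mean(W, A, Y)
    lo, hi = res["tmle"] - 1.96 * res["se"], res["tmle"] + 1.96 * res["se"]
    print(f"IPTW: {res['iptw']:.3f}  TMLE: {res['tmle']:.3f}  95% CI: ({lo:.3f}, {hi:.3f})")
```

The fluctuation is fitted without an intercept so that the empirical mean of the estimated influence curve is solved to zero, which is what justifies the Wald-type interval in this simplified parametric setting. With data-adaptive nuisance estimators, the abstract's point is that this step alone may not suffice and the nuisance fits themselves need further targeting.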

Similar articles

Targeted estimation of nuisance parameters to obtain valid statistical inference.

Int J Biostat. 2014;10(1):29-57. doi: 10.1515/ijb-2012-0038.

Collaborative double robust targeted maximum likelihood estimation.

Int J Biostat. 2010 May 17;6(1):Article 17. doi: 10.2202/1557-4679.1181.

Data-Adaptive Bias-Reduced Doubly Robust Estimation.

Int J Biostat. 2016 May 1;12(1):253-82. doi: 10.1515/ijb-2015-0029.

Double Robust Efficient Estimators of Longitudinal Treatment Effects: Comparative Performance in Simulations and a Case Study.

Int J Biostat. 2019 Feb 26;15(2):/j/ijb.2019.15.issue-2/ijb-2017-0054/ijb-2017-0054.xml. doi: 10.1515/ijb-2017-0054.

A Generally Efficient Targeted Minimum Loss Based Estimator based on the Highly Adaptive Lasso.

Int J Biostat. 2017 Oct 12;13(2):/j/ijb.2017.13.issue-2/ijb-2015-0097/ijb-2015-0097.xml. doi: 10.1515/ijb-2015-0097.

Causal Inference for a Population of Causally Connected Units.

J Causal Inference. 2014 Mar;2(1):13-74. doi: 10.1515/jci-2013-0002.

Collaborative-controlled LASSO for constructing propensity score-based estimators in high-dimensional data.

Stat Methods Med Res. 2019 Apr;28(4):1044-1063. doi: 10.1177/0962280217744588. Epub 2017 Dec 11.

Scalable collaborative targeted learning for high-dimensional data.

Stat Methods Med Res. 2019 Feb;28(2):532-554. doi: 10.1177/0962280217729845. Epub 2017 Sep 22.

Estimators and confidence intervals for the marginal odds ratio using logistic regression and propensity score stratification.

Stat Med. 2010 Mar 30;29(7-8):760-9. doi: 10.1002/sim.3811.

A Case Study of the Impact of Data-Adaptive Versus Model-Based Estimation of the Propensity Scores on Causal Inferences from Three Inverse Probability Weighting Estimators.

Int J Biostat. 2016 May 1;12(1):131-55. doi: 10.1515/ijb-2015-0028.

Cited by

Nonparametric assessment of regimen response curve estimators.

Biometrics. 2025 Apr 2;81(2). doi: 10.1093/biomtc/ujaf066.

Developing a synthetic control group using electronic health records: Application to a single-arm lifestyle intervention study.

Prev Med Rep. 2021 Oct 4;24:101572. doi: 10.1016/j.pmedr.2021.101572. eCollection 2021 Dec.

Complier stochastic direct effects: identification and robust estimation.

J Am Stat Assoc. 2021;116(535):1254-1264. doi: 10.1080/01621459.2019.1704292. Epub 2020 Jan 23.

Demystifying Statistical Inference When Using Machine Learning in Causal Research.

Am J Epidemiol. 2021 Jul 15;192(9):1545-9. doi: 10.1093/aje/kwab200.

Robust Q-learning.

J Am Stat Assoc. 2021;116(533):368-381. doi: 10.1080/01621459.2020.1753522. Epub 2020 Jun 8.

Estimating mean potential outcome under adaptive treatment length strategies in continuous time.

Biometrics. 2022 Dec;78(4):1503-1514. doi: 10.1111/biom.13504. Epub 2021 Jun 15.

Thirteen Questions About Using Machine Learning in Causal Research (You Won't Believe the Answer to Number 10!).

Am J Epidemiol. 2021 Aug 1;190(8):1476-1482. doi: 10.1093/aje/kwab047.

Efficiently transporting causal direct and indirect effects to new populations under intermediate confounding and with multiple mediators.

Biostatistics. 2022 Jul 18;23(3):789-806. doi: 10.1093/biostatistics/kxaa057.

Exercise During the First Trimester of Pregnancy and the Risks of Abnormal Screening and Gestational Diabetes Mellitus.

Diabetes Care. 2021 Feb;44(2):425-432. doi: 10.2337/dc20-1475. Epub 2020 Dec 21.

Far from MCAR: Obtaining Population-level Estimates of HIV Viral Suppression.

Epidemiology. 2020 Sep;31(5):620-627. doi: 10.1097/EDE.0000000000001215.
