Department of Statistics, University of Florida, Gainesville, Florida, USA.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
Biometrics. 2022 Mar;78(1):100-114. doi: 10.1111/biom.13417. Epub 2020 Dec 31.
We introduce a framework for estimating causal effects of binary and continuous treatments in high dimensions. We show how posterior distributions of treatment and outcome models can be used together with doubly robust estimators. We propose an approach to uncertainty quantification for the doubly robust estimator, which utilizes posterior distributions of model parameters and (1) results in good frequentist properties in small samples, (2) is based on a single run of a Markov chain Monte Carlo (MCMC) algorithm, and (3) improves over frequentist measures of uncertainty which rely on asymptotic properties. We consider a flexible framework for modeling the treatment and outcome processes within the Bayesian paradigm that reduces model dependence, accommodates nonlinearity, and achieves dimension reduction of the covariate space. We illustrate the ability of the proposed approach to flexibly estimate causal effects in high dimensions and appropriately quantify uncertainty. We show that our proposed variance estimation strategy is consistent when both models are correctly specified, and we see empirically that it performs well in finite samples and under model misspecification. Finally, we estimate the effect of continuous environmental exposures on cholesterol and triglyceride levels.
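As a rough illustration of how posterior draws of a treatment model and an outcome model might be combined with a doubly robust estimator, the sketch below evaluates the standard augmented inverse probability weighting (AIPW) estimator of the average treatment effect for a binary treatment at each draw and then averages across draws. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names `aipw_ate` and `dr_over_draws`, the propensity clipping threshold, and the simulated data and stand-in "posterior draws" are all hypothetical, and the spread of the per-draw estimates is not the variance estimator proposed in the paper.

```python
import numpy as np


def aipw_ate(y, t, e, m1, m0):
    """Standard AIPW (doubly robust) estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicator, e: propensity scores P(T=1 | X),
    m1, m0: outcome regression predictions under treatment and control.
    """
    # Clip propensities away from 0 and 1 (an assumption made here for stability).
    e = np.clip(e, 1e-3, 1 - 1e-3)
    psi = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return psi.mean()


def dr_over_draws(y, t, e_draws, m1_draws, m0_draws):
    """Evaluate the AIPW estimator at each posterior draw and average.

    Each *_draws array has shape (number of draws, number of observations).
    """
    per_draw = np.array([
        aipw_ate(y, t, e, m1, m0)
        for e, m1, m0 in zip(e_draws, m1_draws, m0_draws)
    ])
    return per_draw.mean(), per_draw


# Simulated data with a true treatment effect of 2 (illustrative only).
rng = np.random.default_rng(0)
n, n_draws = 500, 200
x = rng.normal(size=n)
e_true = 1.0 / (1.0 + np.exp(-x))
t = rng.binomial(1, e_true)
y = 1.0 + x + 2.0 * t + rng.normal(size=n)

# Stand-ins for MCMC output: the true nuisance functions plus small noise.
# In practice these would be draws from fitted Bayesian models.
e_draws = np.clip(e_true + rng.normal(scale=0.02, size=(n_draws, n)), 0.01, 0.99)
m0_draws = 1.0 + x + rng.normal(scale=0.05, size=(n_draws, 1))
m1_draws = m0_draws + 2.0 + rng.normal(scale=0.05, size=(n_draws, 1))

estimate, per_draw = dr_over_draws(y, t, e_draws, m1_draws, m0_draws)
print(f"DR estimate averaged over posterior draws: {estimate:.3f}")
```

Averaging over draws reuses a single MCMC run for both nuisance models. The per-draw variation reflects only parameter uncertainty, so a complete uncertainty statement would also need to account for sampling variability in the data.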