Department of Statistics, SungKyunKwan University, Seoul, South Korea.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
Biometrics. 2023 Dec;79(4):3252-3265. doi: 10.1111/biom.13833. Epub 2023 Feb 16.
Analysis of observational studies increasingly confronts the challenge of determining which of a possibly high-dimensional set of available covariates are required to satisfy the assumption of ignorable treatment assignment for estimation of causal effects. We propose a Bayesian nonparametric approach that simultaneously (1) prioritizes inclusion of adjustment variables in accordance with existing principles of confounder selection; (2) estimates causal effects in a manner that permits complex relationships among confounders, exposures, and outcomes; and (3) provides causal estimates that account for uncertainty in the nature of confounding. The proposal relies on specification of multiple Bayesian additive regression trees (BART) models, linked together with a common prior distribution that accrues posterior selection probability to covariates on the basis of association with both the exposure and the outcome of interest. A set of extensive simulation studies demonstrates that the proposed method performs well relative to similarly motivated methodologies in a variety of scenarios. We deploy the method to investigate the causal effect of emissions from coal-fired power plants on ambient air pollution concentrations, where the prospect of confounding due to local and regional meteorological factors introduces uncertainty around the confounding role of a high-dimensional set of measured variables. Ultimately, we show that the proposed method produces more efficient and more consistent results across adjacent years than alternative methods, lending strength to the evidence of the causal relationship between SO2 emissions and ambient particulate pollution.
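The idea of a common prior that accrues selection probability from both the exposure and the outcome models can be illustrated with a toy calculation. The sketch below is not the authors' implementation: it assumes, purely for illustration, that the shared prior is a symmetric Dirichlet over covariates and that each tree model reports how often it split on each covariate. The function name, the `alpha` parameter, and the split counts are all hypothetical.

```python
def shared_selection_posterior_mean(exposure_splits, outcome_splits, alpha=1.0):
    """Posterior mean of a shared covariate-selection probability vector.

    Assumption (illustrative only): selection probabilities follow a
    symmetric Dirichlet(alpha / p, ..., alpha / p) prior shared by the
    exposure and outcome tree models, so that split counts from both
    models add to the same Dirichlet concentration parameters.
    """
    if len(exposure_splits) != len(outcome_splits):
        raise ValueError("both models must report counts for the same covariates")
    p = len(exposure_splits)
    # Conjugate update: prior mass plus split counts pooled across both models.
    concentration = [
        alpha / p + e + o for e, o in zip(exposure_splits, outcome_splits)
    ]
    total = sum(concentration)
    return [c / total for c in concentration]


# Hypothetical split counts for four covariates:
# X0 is used by both models, X1 only by the outcome model,
# X2 only by the exposure model, X3 by neither.
exposure_counts = [12, 0, 5, 0]
outcome_counts = [9, 7, 0, 0]

probs = shared_selection_posterior_mean(exposure_counts, outcome_counts)
for j, pr in enumerate(probs):
    print(f"X{j}: {pr:.3f}")
```

Under these assumed counts, the covariate used by both models (X0) receives the largest share of selection probability, which matches the abstract's description of prioritizing variables associated with both exposure and outcome. In a full sampler, this update would be one step inside a larger MCMC scheme rather than a one-off calculation.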