High-Dimensional Confounding Adjustment Using Continuous Spike and Slab Priors

Authors

Joseph Antonelli, Giovanni Parmigiani, Francesca Dominici

Affiliations

Department of Statistics, University of Florida, 102 Griffin-Floyd Hall, P.O. Box 118545, Gainesville, FL, 32611, USA.

Department of Biostatistics and Computational Biology, CLS 11007, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA, 02215, USA.

Publication information

Bayesian Anal. 2019 Sep;14(3):805-828. doi: 10.1214/18-ba1131. Epub 2019 Jun 11.

DOI:10.1214/18-ba1131

PMID:32431779

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7236769/

Abstract

In observational studies, estimation of a causal effect of a treatment on an outcome relies on proper adjustment for confounding. If the number of the potential confounders ($P$) is larger than the number of observations ($N$), then direct control for all potential confounders is infeasible. Existing approaches for dimension reduction and penalization are generally aimed at predicting the outcome, and are less suited for estimation of causal effects. Under standard penalization approaches (e.g., Lasso), if a variable $X_j$ is strongly associated with the treatment $T$ but weakly with the outcome $Y$, the coefficient $\beta_j$ will be shrunk towards zero, thus leading to confounding bias. Under the assumption of a linear model for the outcome and sparsity, we propose continuous spike and slab priors on the regression coefficients $\beta_j$ corresponding to the potential confounders $X_j$. Specifically, we introduce a prior distribution that does not heavily shrink to zero the coefficients ($\beta_j$s) of the $X_j$s that are strongly associated with $T$ but weakly associated with $Y$. We compare our proposed approach to several state-of-the-art methods proposed in the literature. Our proposed approach has the following features: 1) it reduces confounding bias in high-dimensional settings; 2) it shrinks towards zero coefficients of instrumental variables; and 3) it achieves good coverage even in small sample sizes. We apply our approach to the National Health and Nutrition Examination Survey (NHANES) data to estimate the causal effects of persistent pesticide exposure on triglyceride levels.
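
The failure mode described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it uses plain least squares with NumPy, and the data-generating model, the coefficient values (2.0, 0.3, 1.0), the sample size, and the helper name `ols` are all assumptions chosen for illustration. It treats dropping a covariate as the limiting case of shrinking its coefficient to zero, and compares the treatment-effect estimate with and without that covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # assumed sample size, large enough to make the bias visible

# Hypothetical data-generating model (illustrative values only):
# x is a confounder strongly associated with the treatment t
# and only weakly associated with the outcome y.
x = rng.normal(size=n)
t = 2.0 * x + rng.normal(size=n)
true_effect = 1.0
y = true_effect * t + 0.3 * x + rng.normal(size=n)


def ols(design, response):
    """Return least-squares coefficients for the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


ones = np.ones(n)

# Adjusting for x: columns are intercept, treatment, confounder.
adjusted = ols(np.column_stack([ones, t, x]), y)[1]

# Coefficient of x forced to zero (x dropped from the model).
unadjusted = ols(np.column_stack([ones, t]), y)[1]

print(f"adjusted estimate:   {adjusted:.3f}")
print(f"unadjusted estimate: {unadjusted:.3f}")
```

Under these assumed values the omitted-variable bias is 0.3 × 2 / (2² + 1) = 0.12, so the unadjusted estimate concentrates near 1.12 while the adjusted estimate concentrates near the true effect of 1.0. A penalty tuned for predicting the outcome has little reason to keep such a covariate, which is the situation the proposed prior is designed to avoid.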
