Collaborative-controlled LASSO for constructing propensity score-based estimators in high-dimensional data.

Affiliations

1 Division of Biostatistics, University of California, USA.

2 Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, USA.

Publication information

Stat Methods Med Res. 2019 Apr;28(4):1044-1063. doi: 10.1177/0962280217744588. Epub 2017 Dec 11.

DOI:10.1177/0962280217744588

PMID:29226777

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6039292/

Abstract

Propensity score-based estimators are increasingly used for causal inference in observational studies. However, model selection for propensity score estimation in high-dimensional data has received little attention. In these settings, propensity score models have traditionally been selected based on the goodness-of-fit for the treatment mechanism itself, without consideration of the causal parameter of interest. Collaborative minimum loss-based estimation is a novel methodology for causal inference that takes into account information on the causal parameter of interest when selecting a propensity score model. This "collaborative learning" considers variable associations with both treatment and outcome when selecting a propensity score model in order to minimize a bias-variance tradeoff in the estimated treatment effect. In this study, we introduce a novel approach for collaborative model selection when using the LASSO estimator for propensity score estimation in high-dimensional covariate settings. To demonstrate the importance of selecting the propensity score model collaboratively, we designed quasi-experiments based on a real electronic healthcare database, where only the potential outcomes were manually generated, and the treatment and baseline covariates remained unchanged. Results showed that the collaborative minimum loss-based estimation algorithm outperformed other competing estimators for both point estimation and confidence interval coverage. In addition, the propensity score model selected by collaborative minimum loss-based estimation could be applied to other propensity score-based estimators, which also resulted in substantive improvement for both point estimation and confidence interval coverage. We illustrate the discussed concepts through an empirical example comparing the effects of non-selective nonsteroidal anti-inflammatory drugs with selective COX-2 inhibitors on gastrointestinal complications in a population of Medicare beneficiaries.
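
The contrast the abstract draws, between picking the LASSO penalty by goodness-of-fit for treatment and picking it with the outcome in view, can be sketched in a few lines. The code below is a simplified sample-splitting illustration and is not the authors' C-TMLE algorithm. The simulated data, the function names (`fit_l1_logistic`, `compare_selectors`), the penalty grid, the truncation bounds, and the deliberately unadjusted initial outcome estimate are all assumptions made for the example. For each penalty it fits an L1-penalized logistic propensity score on a training half, applies a linear targeting (fluctuation) step to the initial outcome estimate, and then scores the penalty two ways on the validation half: by the log-loss of the treatment model (the traditional criterion) and by the squared-error loss of the targeted outcome estimate (a stand-in for the collaborative criterion).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_l1_logistic(X, a, lam, n_iter=300):
    """Approximate L1-penalized logistic regression via proximal gradient
    descent (ISTA). The intercept is not penalized."""
    n, p = X.shape
    b0, beta = 0.0, np.zeros(p)
    # Step size 1/L, with L an upper bound on the gradient's Lipschitz constant.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / (4 * n) + 0.25)
    for _ in range(n_iter):
        resid = sigmoid(b0 + X @ beta) - a
        b0 -= step * resid.mean()
        beta = beta - step * (X.T @ resid) / n
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)
    return b0, beta

def compare_selectors(seed=0, n=2000, p=20,
                      lambdas=(0.2, 0.1, 0.05, 0.02, 0.01, 0.005)):
    rng = np.random.default_rng(seed)
    # Hypothetical data: W0 is a confounder, W1 affects treatment only,
    # W2 affects the outcome only; the true treatment effect is 1.0.
    W = rng.normal(size=(n, p))
    A = rng.binomial(1, sigmoid(0.8 * W[:, 0] + 0.8 * W[:, 1]))
    Y = 1.0 * A + 1.5 * W[:, 0] + 1.0 * W[:, 2] + rng.normal(size=n)

    tr = np.arange(n) < n // 2   # training half
    va = ~tr                     # validation half

    # Deliberately crude initial outcome estimate: arm-specific means only.
    mu1, mu0 = Y[tr & (A == 1)].mean(), Y[tr & (A == 0)].mean()
    Q0 = np.where(A == 1, mu1, mu0)

    rows = []
    for lam in lambdas:
        b0, beta = fit_l1_logistic(W[tr], A[tr], lam)
        g = np.clip(sigmoid(b0 + W @ beta), 0.025, 0.975)  # truncated PS
        H = A / g - (1 - A) / (1 - g)                      # clever covariate
        # Least-squares fluctuation of Q0 along H, fitted on the training half.
        eps = np.sum(H[tr] * (Y[tr] - Q0[tr])) / np.sum(H[tr] ** 2)
        # Traditional criterion: validation log-loss of the treatment model.
        ps_loss = -np.mean(A[va] * np.log(g[va])
                           + (1 - A[va]) * np.log(1 - g[va]))
        # Collaborative-style criterion: validation loss of the targeted fit.
        out_loss = np.mean((Y[va] - Q0[va] - eps * H[va]) ** 2)
        # Targeted plug-in estimate of the average treatment effect.
        ate = (mu1 - mu0) + eps * np.mean(1 / g + 1 / (1 - g))
        rows.append((lam, ps_loss, out_loss, ate))

    return {
        "rows": rows,
        "traditional_lambda": min(rows, key=lambda r: r[1])[0],
        "collaborative_lambda": min(rows, key=lambda r: r[2])[0],
    }

if __name__ == "__main__":
    result = compare_selectors()
    for lam, ps_loss, out_loss, ate in result["rows"]:
        print(f"lambda={lam:<6} ps_loss={ps_loss:.4f} "
              f"outcome_loss={out_loss:.4f} ate={ate:.3f}")
    print("selected by treatment fit:", result["traditional_lambda"])
    print("selected by targeted outcome fit:", result["collaborative_lambda"])
```

The two criteria need not agree: the first rewards any covariate that predicts treatment, while the second only rewards propensity score fits that improve the targeted outcome estimate. The full method described in the paper uses cross-validation and a more careful construction than this single split.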

Similar articles

Scalable collaborative targeted learning for high-dimensional data.

Stat Methods Med Res. 2019 Feb;28(2):532-554. doi: 10.1177/0962280217729845. Epub 2017 Sep 22.

Collaborative double robust targeted maximum likelihood estimation.

Int J Biostat. 2010 May 17;6(1):Article 17. doi: 10.2202/1557-4679.1181.

Targeted estimation of nuisance parameters to obtain valid statistical inference.

Int J Biostat. 2014;10(1):29-57. doi: 10.1515/ijb-2012-0038.

Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference.

Int J Biostat. 2016 May 1;12(1):97-115. doi: 10.1515/ijb-2015-0017.

Double Robust Efficient Estimators of Longitudinal Treatment Effects: Comparative Performance in Simulations and a Case Study.

Int J Biostat. 2019 Feb 26;15(2):/j/ijb.2019.15.issue-2/ijb-2017-0054/ijb-2017-0054.xml. doi: 10.1515/ijb-2017-0054.

On adaptive propensity score truncation in causal inference.

Stat Methods Med Res. 2019 Jun;28(6):1741-1760. doi: 10.1177/0962280218774817. Epub 2018 Jul 11.

Variable Selection for Confounding Adjustment in High-dimensional Covariate Spaces When Analyzing Healthcare Databases.

Epidemiology. 2017 Mar;28(2):237-248. doi: 10.1097/EDE.0000000000000581.

Outcome-adaptive lasso: Variable selection for causal inference.

Biometrics. 2017 Dec;73(4):1111-1122. doi: 10.1111/biom.12679. Epub 2017 Mar 8.

Propensity score interval matching: using bootstrap confidence intervals for accommodating estimation errors of propensity scores.

BMC Med Res Methodol. 2015 Jul 28;15:53. doi: 10.1186/s12874-015-0049-3.

Cited by

Commentary on "Nonparametric identification is not enough, but randomized controlled trials are": Statistical considerations for generating reliable evidence across a spectrum of studies that increasingly involve real-world elements.

Obs Stud. 2025 Apr 11;11(1):61-76. doi: 10.1353/obs.2025.a956842. eCollection 2025.

A Dynamic Prognostic Model for Identifying Vulnerable COVID-19 Patients at High Risk of Rapid Deterioration.

Pharmacoepidemiol Drug Saf. 2024 Aug;33(8):e5872. doi: 10.1002/pds.5872.

Targeted learning with an undersmoothed LASSO propensity score model for large-scale covariate adjustment in health-care database studies.

Am J Epidemiol. 2024 Nov 4;193(11):1632-1640. doi: 10.1093/aje/kwae023.

Performance of modeling and balancing approach methods when using weights to estimate treatment effects in observational time-to-event settings.

PLoS One. 2023 Dec 7;18(12):e0289316. doi: 10.1371/journal.pone.0289316. eCollection 2023.

An application of the Causal Roadmap in two safety monitoring case studies: Causal inference and outcome prediction using electronic health record data.

J Clin Transl Sci. 2023 Sep 21;7(1):e208. doi: 10.1017/cts.2023.632. eCollection 2023.

Scalable Feature Engineering from Electronic Free Text Notes to Supplement Confounding Adjustment of Claims-Based Pharmacoepidemiologic Studies.

Clin Pharmacol Ther. 2023 Apr;113(4):832-838. doi: 10.1002/cpt.2826. Epub 2023 Jan 11.

Identification of necroptosis-related genes for predicting prognosis and exploring immune infiltration landscape in colon adenocarcinoma.

Front Oncol. 2022 Nov 24;12:941156. doi: 10.3389/fonc.2022.941156. eCollection 2022.

Machine learning for improving high-dimensional proxy confounder adjustment in healthcare database studies: An overview of the current literature.

Pharmacoepidemiol Drug Saf. 2022 Sep;31(9):932-943. doi: 10.1002/pds.5500. Epub 2022 Jul 5.

Synthetic Negative Controls: Using Simulation to Screen Large-scale Propensity Score Analyses.

Epidemiology. 2022 Jul 1;33(4):541-550. doi: 10.1097/EDE.0000000000001482. Epub 2022 Apr 12.

Evaluating the robustness of targeted maximum likelihood estimators via realistic simulations in nutrition intervention trials.

Stat Med. 2022 May 30;41(12):2132-2165. doi: 10.1002/sim.9348. Epub 2022 Feb 16.
