Franklin Jessica M, Eddings Wesley, Glynn Robert J, Schneeweiss Sebastian
Am J Epidemiol. 2015 Oct 1;182(7):651-9. doi: 10.1093/aje/kwv108. Epub 2015 Aug 1.
Selection and measurement of confounders is critical for successful adjustment in nonrandomized studies. Although the principles behind confounder selection are now well established, variable selection for confounder adjustment remains a difficult problem in practice, particularly in secondary analyses of databases. We present a simulation study that compares the high-dimensional propensity score algorithm for variable selection with approaches that utilize direct adjustment for all potential confounders via regularized regression, including ridge regression and lasso regression. Simulations were based on 2 previously published pharmacoepidemiologic cohorts and used the plasmode simulation framework to create realistic simulated data sets with thousands of potential confounders. Performance of methods was evaluated with respect to bias and mean squared error of the estimated effects of a binary treatment. Simulation scenarios varied the true underlying outcome model, treatment effect, prevalence of exposure and outcome, and presence of unmeasured confounding. Across scenarios, high-dimensional propensity score approaches generally performed better than regularized regression approaches. However, including the variables selected by lasso regression in a regular propensity score model also performed well and may provide a promising alternative variable selection method.
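As an illustration of the evaluation criteria named in the abstract, the following minimal sketch shows how bias and mean squared error of an estimated treatment effect could be computed across simulated data sets when the true effect is known by construction. The method names (`method_a`, `method_b`), the function `bias_and_mse`, and all numeric values are hypothetical placeholders, not results or code from the study.

```python
from statistics import fmean


def bias_and_mse(estimates, true_effect):
    """Return (bias, mse) of effect estimates across simulated data sets.

    bias = mean(estimate - truth); mse = mean((estimate - truth)^2).
    """
    errors = [est - true_effect for est in estimates]
    bias = fmean(errors)
    mse = fmean([e * e for e in errors])
    return bias, mse


# Assumed true treatment effect on the log odds ratio scale (illustrative only).
true_log_or = 0.0

# Hypothetical estimates from four simulated data sets per method.
estimates_by_method = {
    "method_a": [0.10, -0.10, 0.20, -0.20],  # centered on the truth but variable
    "method_b": [0.30, 0.30, 0.30, 0.30],    # stable but systematically off
}

results = {
    name: bias_and_mse(estimates, true_log_or)
    for name, estimates in estimates_by_method.items()
}

for name, (bias, mse) in results.items():
    print(f"{name}: bias={bias:.3f}, mse={mse:.3f}")
```

In this toy input, the first method has no bias but nonzero mean squared error because its estimates vary, whereas the second has no variability but a larger mean squared error driven entirely by bias. Reporting both quantities, as the abstract describes, separates systematic error from overall accuracy.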