Li Yun, Lee Yoonseok, Port Friedrich K, Robinson Bruce M
Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA.
Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
Stat Methods Med Res. 2020 Aug;29(8):2119-2139. doi: 10.1177/0962280219883323. Epub 2019 Nov 7.
Unmeasured confounding almost always exists in observational studies and can bias estimates of exposure effects. Instrumental variable methods are popular choices for combating unmeasured confounding to obtain less biased effect estimates. However, we demonstrate that alternative methods may give less biased estimates, depending on the nature of the unmeasured confounding. Treatment preferences of clusters (e.g. physician practices) are the most frequently used instruments in instrumental variable analyses. These preference-based instrumental variable analyses are usually conducted on data clustered by region, hospital/facility, or physician, where unmeasured confounding often occurs within or between clusters. We aim to quantify the impact of unmeasured confounding on the bias of effect estimators in instrumental variable analysis and in several common alternative methods, namely ordinary least squares regression, linear mixed models, and fixed-effect models, when studying the effect of a continuous exposure (e.g. treatment dose) on a continuous outcome. We derive closed-form expressions for the asymptotic bias of the estimators from these four methods in the presence of unmeasured within- and/or between-cluster confounders. Simulations demonstrate that the asymptotic bias formulae approximate the bias in finite samples well for all methods. The bias formulae show that instrumental variable analyses can provide consistent estimates when unmeasured within-cluster confounding exists, but not when between-cluster confounding exists. On the other hand, fixed-effect models and linear mixed models can provide consistent estimates when unmeasured between-cluster confounding exists, but not when within-cluster confounding exists. Whether instrumental variable analyses are advantageous in reducing bias over fixed-effect models and linear mixed models depends on the extent of unmeasured within-cluster confounding relative to between-cluster confounding. Furthermore, the impact of unmeasured between-cluster confounding on instrumental variable analysis estimates is larger than the impact of unmeasured within-cluster confounding on fixed-effect model and linear mixed model estimates. We illustrate the use of these methods in estimating the effect of erythropoiesis stimulating agents on hemoglobin levels. Our findings provide guidance for choosing appropriate methods to combat the dominant types of unmeasured confounders and help interpret statistical results in the context of unmeasured confounding.
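The consistency pattern described above can be illustrated with a small simulation. The Python sketch below is not the authors' code: the data-generating model, the parameter values, and the names `simulate`, `cluster_means`, `slope`, and `estimates` are all assumptions made for illustration. It takes the cluster mean of the exposure as the preference-based instrument, which is one common choice and may differ from the specification used in the paper, and it omits the linear mixed model to stay dependency-free beyond NumPy.

```python
import numpy as np

def simulate(J, n, a_w, g_w, a_b, g_b, beta=1.0, seed=0):
    """Simulate J clusters of size n (hypothetical model, not from the paper).

    a_w, g_w: effects of the within-cluster confounder on exposure and outcome.
    a_b, g_b: effects of the between-cluster confounder on exposure and outcome.
    """
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(J), n)
    z = rng.normal(size=J)[cluster]   # cluster-level treatment preference
    u = rng.normal(size=J)[cluster]   # unmeasured between-cluster confounder
    w = rng.normal(size=J * n)        # unmeasured within-cluster confounder
    x = z + a_b * u + a_w * w + rng.normal(size=J * n)
    y = beta * x + g_b * u + g_w * w + rng.normal(size=J * n)
    return cluster, x, y

def cluster_means(v, cluster):
    """Return each observation's cluster mean of v."""
    sums = np.bincount(cluster, weights=v)
    counts = np.bincount(cluster)
    return (sums / counts)[cluster]

def slope(x, y):
    """Simple-regression slope of y on x."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def estimates(cluster, x, y):
    xbar = cluster_means(x, cluster)
    ybar = cluster_means(y, cluster)
    inst = xbar - xbar.mean()         # assumed instrument: cluster mean exposure
    return {
        "ols": slope(x, y),                     # pooled least squares
        "fe": slope(x - xbar, y - ybar),        # within-cluster (fixed-effect) slope
        "iv": float(inst @ (y - y.mean()) / (inst @ (x - x.mean()))),  # cov(inst, y) / cov(inst, x)
    }

# True effect is beta = 1 in both scenarios.
within = estimates(*simulate(J=500, n=50, a_w=1, g_w=1, a_b=0, g_b=0))
between = estimates(*simulate(J=500, n=50, a_w=0, g_w=0, a_b=1, g_b=1))
print("within-cluster confounding only: ", within)
print("between-cluster confounding only:", between)
```

Under these assumed settings, the instrumental variable estimate should land close to the true effect when only within-cluster confounding is present, and the fixed-effect estimate should land close to it when only between-cluster confounding is present, with each method biased in the other scenario and ordinary least squares biased in both.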