Terza Joseph V, Bradford W David, Dismuke Clara E
Department of Epidemiology and Health Policy Research, University of Florida, 1329 SW 16th Street, Room 5130, PO Box 100147, Gainesville, FL 32610-0147, USA.
Health Serv Res. 2008 Jun;43(3):1102-20. doi: 10.1111/j.1475-6773.2007.00807.x.
To investigate potential bias in the use of the conventional linear instrumental variables (IV) method for the estimation of causal effects in inherently nonlinear regression settings.
Smoking Supplement to the 1979 National Health Interview Survey, National Longitudinal Alcohol Epidemiologic Survey, and simulated data.
Potential bias from the use of the linear IV method in nonlinear models is assessed via simulation studies and real-world data analyses in two commonly encountered regression settings: (1) models with a nonnegative outcome (e.g., a count) and a continuous endogenous regressor; and (2) models with a binary outcome and a binary endogenous regressor.
The simulation analyses show that substantial bias in the estimation of causal effects can result from applying the conventional IV method in inherently nonlinear regression settings. Moreover, the bias is not attenuated as the sample size increases. This point is further illustrated in the survey data analyses in which IV-based estimates of the relevant causal effects diverge substantially from those obtained with appropriate nonlinear estimation methods.
We offer this research as a cautionary note to those who would opt for the use of linear specifications in inherently nonlinear settings involving endogeneity.
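The simulation finding reported above can be illustrated with a minimal sketch. It is not the authors' simulation design: the data-generating process, the parameter values, the binary instrument, and the function names (`simulate`, `linear_iv`, `relative_gap`) are all assumptions made for illustration. The sketch draws a count outcome with an exponential conditional mean and a continuous endogenous regressor, computes the just-identified linear IV slope, and compares it with the average marginal effect of the regressor, which is the causal target assumed here.

```python
import numpy as np


def simulate(n, rng, b0=0.0, b1=1.0, lam=0.5, a=2.0):
    """Draw one sample from an assumed exponential-mean model with endogeneity."""
    z = rng.integers(0, 2, size=n).astype(float)  # binary instrument (assumed)
    u = rng.normal(0.0, 0.5, size=n)              # unobserved confounder
    x = a * z + u                                 # continuous endogenous regressor
    mean_y = np.exp(b0 + b1 * x + lam * u)        # structural conditional mean
    y = rng.poisson(mean_y).astype(float)         # nonnegative count outcome
    true_ame = np.mean(b1 * mean_y)               # average marginal effect of x
    return y, x, z, true_ame


def linear_iv(y, x, z):
    """Just-identified linear IV slope: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def relative_gap(n, seed=0):
    """Relative difference between the linear IV estimate and the true AME."""
    rng = np.random.default_rng(seed)
    y, x, z, true_ame = simulate(n, rng)
    return (linear_iv(y, x, z) - true_ame) / true_ame


if __name__ == "__main__":
    # The gap should stay roughly constant rather than shrink as n grows.
    for n in (2_000, 20_000, 200_000):
        print(n, round(relative_gap(n), 3))
```

Under these assumed parameters the population ratio of the linear IV slope to the average marginal effect is (e^2 - 1)/(e^2 + 1), about 0.76, so the relative gap settles near -24% instead of vanishing as the sample grows. The size and sign of the gap depend on the assumed design and on the choice of target effect.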