Qasim Muhammad, Månsson Kristofer, Balakrishnan Narayanaswamy
Department of Economics, Finance and Statistics, Jönköping International Business School, Jönköping University, Sweden.
Department of Mathematics and Statistics, McMaster University, Hamilton, Ontario, Canada.
Res Sq. 2024 Dec 4:rs.3.rs-5550004. doi: 10.21203/rs.3.rs-5550004/v1.
Classical best-subset selection is NP-hard (nondeterministic polynomial-time hard) and is therefore computationally challenging. For linear regression, the problem can now be solved with advanced mixed integer optimization (MIO) algorithms. We extend this methodology to linear instrumental variable (IV) regression and propose the best-subset instrumental variable (BSIV) method, which incorporates the MIO procedure. Classical IV estimation methods assume that IVs do not directly affect the outcome variable and are uncorrelated with unmeasured variables. In practice, however, IVs are likely to be invalid, and in certain situations existing methods can produce a bias that is large relative to the standard errors. The proposed BSIV estimator is robust for estimating causal effects when the validity of the IVs is unknown. Through Monte Carlo simulations, we demonstrate that BSIV using MIO algorithms outperforms two-stage least squares, Lasso-type IVs, and two-sample analysis (median and mode estimators) in terms of bias and relative efficiency. To demonstrate the utility of the proposed method, we analyze two datasets: one on the health-related quality of life index and proximity, and one on the education-wage relationship.
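The idea of selecting a subset of instruments to treat as invalid can be illustrated with a small Python sketch. It is not the authors' BSIV implementation: exhaustive enumeration stands in for the MIO solver, and the working model (outcome = beta * exposure + direct effects of at most k instruments + error) is an assumption borrowed from the general invalid-IV setting, not a formulation stated in the abstract. The names `tsls_fit`, `best_subset_iv`, and `simulate`, and all simulation settings, are hypothetical.

```python
import itertools
import numpy as np


def tsls_fit(y, d, Z, invalid):
    """2SLS of y on d, with the columns of Z listed in `invalid` added as
    controls and the remaining columns used as instruments.
    Returns (beta_hat, criterion), where criterion = ||P_Z residual||^2."""
    X = np.column_stack([d, Z[:, list(invalid)]])
    # First stage: project the regressors onto the column space of Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the projected regressors.
    coef = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    resid = y - X @ coef
    proj = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    return float(coef[0]), float(proj @ proj)


def best_subset_iv(y, d, Z, k):
    """Enumerate every set of exactly k instruments treated as invalid
    (requires k < number of instruments) and keep the set with the
    smallest criterion. Brute force; an assumed stand-in for MIO."""
    best = None
    for subset in itertools.combinations(range(Z.shape[1]), k):
        beta, crit = tsls_fit(y, d, Z, subset)
        if best is None or crit < best[2]:
            best = (subset, beta, crit)
    return best


def simulate(n=2000, seed=0):
    """Hypothetical data: 6 instruments, the first two with direct effects
    on y (invalid), true causal effect 0.5, confounder u in both equations."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, 6))
    u = rng.standard_normal(n)
    alpha = np.array([1.0, 0.7, 0.0, 0.0, 0.0, 0.0])
    d = Z.sum(axis=1) + u + rng.standard_normal(n)
    y = 0.5 * d + Z @ alpha + u + rng.standard_normal(n)
    return y, d, Z


y, d, Z = simulate()
subset, beta, crit = best_subset_iv(y, d, Z, k=2)
print("instruments flagged invalid:", subset)
print("estimated causal effect:", round(beta, 3))
```

Enumeration costs one fit per candidate subset, so it is only practical for a handful of instruments; that combinatorial growth is what motivates an MIO formulation for larger problems.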