Rostami Mehdi, Saarela Olli
Dalla Lana School of Public Health, University of Toronto, 155 College St., Toronto, ON M5T 3M7, Canada.
Entropy (Basel). 2022 Jan 25;24(2):179. doi: 10.3390/e24020179.
The average treatment effect (ATE), a causal parameter, is estimated in two steps: in the first step, the treatment and the outcome are modeled as functions of the potential confounders, and in the second step, the resulting predictions are inserted into an ATE estimator such as the augmented inverse probability weighting (AIPW) estimator. Because the relationships between the confounders and the treatment and outcome may be non-linear or unknown, there has been interest in using non-parametric methods such as machine learning (ML) algorithms in the first step. Some of the literature proposes two separate neural networks (NNs) with no regularization of the network parameters other than that provided by stochastic gradient descent (SGD) during optimization. Our simulations indicate that the performance of the AIPW estimator deteriorates substantially when no regularization is used. We propose a normalized version of AIPW (referred to as nAIPW), which can be helpful in some scenarios. nAIPW provably has the same properties as AIPW, namely double robustness and orthogonality. Further, if the first-step algorithms converge fast enough, then under regularity conditions nAIPW is asymptotically normal. We also compare the bias and variance of AIPW and nAIPW when small to moderate L1 regularization is imposed on the NNs.
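The second step described in the abstract can be illustrated with a minimal sketch. It assumes the first-step predictions are already available as arrays: `g` (predicted probability of treatment) and `q1`, `q0` (predicted outcome under treatment and under control). These names and the function names `aipw` and `naipw` are hypothetical. The abstract does not state the exact form of nAIPW, so the normalization shown here is an assumption: a common Hájek-style variant in which each arm's inverse probability weights are divided by their sum rather than by the sample size.

```python
import numpy as np


def aipw(y, a, g, q1, q0):
    """Standard AIPW estimate of the ATE from first-step predictions.

    y: observed outcomes, a: binary treatment indicators (0/1),
    g: predicted P(A=1 | W), q1/q0: predicted outcomes under A=1 / A=0.
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    # Inverse-probability-weighted residuals correct the outcome-model plug-in.
    aug1 = a * (y - q1) / g
    aug0 = (1.0 - a) * (y - q0) / (1.0 - g)
    return float(np.mean(q1 - q0 + aug1 - aug0))


def naipw(y, a, g, q1, q0):
    """Normalized AIPW (assumed Hajek-style normalization, see lead-in).

    The weights in each treatment arm are rescaled to sum to one, which
    limits the influence of propensity predictions close to 0 or 1.
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    w1 = a / g                  # weights for treated units
    w0 = (1.0 - a) / (1.0 - g)  # weights for control units
    aug1 = np.sum(w1 * (y - q1)) / np.sum(w1)
    aug0 = np.sum(w0 * (y - q0)) / np.sum(w0)
    return float(np.mean(q1 - q0) + aug1 - aug0)
```

With a constant propensity of 0.5 and equally sized arms the two functions return the same value. They diverge when some predicted propensities are extreme: the unnormalized weights in `aipw` can then dominate the estimate, while the normalized residual terms in `naipw` remain weighted averages of the residuals.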