Li Wei, Miao Wang, Tchetgen Tchetgen Eric
Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, P.R. China.
Department of Probability and Statistics, Peking University, Beijing, P.R. China.
J R Stat Soc Series B Stat Methodol. 2023 May 8;85(3):913-935. doi: 10.1093/jrsssb/qkad047. eCollection 2023 Jul.
We consider identification of, and inference about, mean functionals of observed covariates and an outcome variable subject to non-ignorable missingness. By leveraging a shadow variable, we establish a necessary and sufficient condition for identification of the mean functional even if the full data distribution is not identified. We further characterize a necessary condition for √n-estimability of the mean functional. This condition naturally strengthens the identifying condition, and it requires the existence of a function as a solution to a representer equation that connects the shadow variable to the mean functional. Solutions to the representer equation may not be unique, which presents substantial challenges for non-parametric estimation, and standard theories for non-parametric sieve estimators are not applicable here. We construct a consistent estimator of the solution set and then adapt the theory of extremum estimators to find from the estimated set a consistent estimator of an appropriately chosen solution. The estimator is asymptotically normal, locally efficient and attains the semi-parametric efficiency bound under certain regularity conditions. We illustrate the proposed approach via simulations and a real data application on home pricing.
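The role of the representer equation can be seen in a toy discrete setting. The sketch below is an illustration under stated assumptions, not the estimator proposed in the paper: it assumes a binary outcome Y, a binary shadow variable Z that is associated with Y but independent of the missingness indicator R given Y, no covariates, and the target functional E[Y]. The representer equation is taken here to be E[δ(Z) | R = 1, Y = y] = τ(y) with τ(y) = y, and the mean is recovered as E[R τ(Y) + (1 − R) δ(Z)]. All probabilities and names (`obs1`, `obs0`, `solve_representer`) are hypothetical. In this 2×2 case the solution δ is unique, so the non-uniqueness the abstract emphasizes does not arise.

```python
# Toy population: binary outcome Y, binary shadow variable Z, no covariates.
# All numbers are illustrative assumptions, not taken from the paper.
p_y = [0.6, 0.4]            # P(Y = y); the true mean E[Y] is 0.4
p_z1_given_y = [0.3, 0.8]   # P(Z = 1 | Y = y): Z is associated with Y
p_r1_given_y = [0.9, 0.5]   # P(R = 1 | Y = y): missingness depends on Y only


def p_z_given_y(z, y):
    """P(Z = z | Y = y) for the toy population."""
    return p_z1_given_y[y] if z == 1 else 1.0 - p_z1_given_y[y]


# Observed-data law: (Y, Z) is seen when R = 1, only Z is seen when R = 0.
obs1 = [[p_y[y] * p_z_given_y(z, y) * p_r1_given_y[y] for z in (0, 1)]
        for y in (0, 1)]    # P(Y = y, Z = z, R = 1)
obs0 = [sum(p_y[y] * p_z_given_y(z, y) * (1.0 - p_r1_given_y[y])
            for y in (0, 1))
        for z in (0, 1)]    # P(Z = z, R = 0)


def solve_representer(obs1, tau):
    """Solve E[delta(Z) | R = 1, Y = y] = tau(y) for y = 0, 1 (Cramer's rule)."""
    # a[y][z] = P(Z = z | R = 1, Y = y), computed from respondents only.
    a = [[obs1[y][z] / sum(obs1[y]) for z in (0, 1)] for y in (0, 1)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("Z is not informative about Y among respondents")
    d0 = (tau[0] * a[1][1] - a[0][1] * tau[1]) / det
    d1 = (a[0][0] * tau[1] - a[1][0] * tau[0]) / det
    return [d0, d1]


tau = [0.0, 1.0]                      # tau(y) = y, so the target is E[Y]
delta = solve_representer(obs1, tau)

# Shadow-variable formula: E[R * tau(Y)] + E[(1 - R) * delta(Z)].
mu_shadow = (sum(obs1[y][z] * tau[y] for y in (0, 1) for z in (0, 1))
             + sum(obs0[z] * delta[z] for z in (0, 1)))

# Complete-case mean E[Y | R = 1], biased under non-ignorable missingness.
p_r1 = sum(obs1[y][z] for y in (0, 1) for z in (0, 1))
mu_naive = sum(obs1[y][z] * tau[y] for y in (0, 1) for z in (0, 1)) / p_r1

print(f"shadow-variable mean: {mu_shadow:.4f}")
print(f"complete-case mean:   {mu_naive:.4f}")
```

Only observed-data quantities enter the calculation: the joint law of (Y, Z) among respondents and the law of Z among nonrespondents. In this toy population the shadow-variable formula returns the true mean of 0.4, while the complete-case mean is about 0.27.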