Eric J. Tchetgen Tchetgen, Linbo Wang, BaoLuo Sun
Department of Biostatistics, Harvard University.
Stat Sin. 2018 Oct;28(4):2069-2088. doi: 10.5705/ss.202016.0325.
Nonmonotone missing data arise routinely in empirical studies in the social and health sciences and, when ignored, can induce selection bias and loss of efficiency. In practice, it is common to account for nonresponse under a missing-at-random assumption which, although convenient, is rarely appropriate when nonresponse is nonmonotone. Likelihood and Bayesian missing data methodologies often require specification of a parametric model for the full data law, thus ruling out any prospect for semiparametric inference. In this paper, we propose an all-purpose approach which delivers semiparametric inferences when missing data are nonmonotone and not at random. The approach uses a discrete choice model (DCM) to generate a large class of nonmonotone nonresponse mechanisms that are nonignorable. Sufficient conditions for nonparametric identification are given, and a general framework for fully parametric and semiparametric inference under an arbitrary DCM is proposed. Special consideration is given to the case of the logit discrete choice nonresponse model (LDCM), for which we describe generalizations of inverse-probability weighting, pattern-mixture estimation, doubly robust estimation and multiply robust estimation.