Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE 68198, U.S.A.
Stat Med. 2013 Nov 30;32(27):4763-80. doi: 10.1002/sim.5875. Epub 2013 Jun 7.
Missing data are a common problem in medical and social studies, especially when data are collected longitudinally, and using the observed data effectively is challenging. The statistical literature on missing data is extensive. It is well known that inverse weighted estimation is neither efficient nor robust, whereas the doubly robust (DR) method can improve both efficiency and robustness. DR estimation requires two models: a missing data model (i.e., a model for the probability that data are observed) and a working regression model (i.e., a model for the outcome variable given covariates and surrogate variables). When the missing data model is correctly specified, the DR estimating function has mean zero for any value of the parameters in the working regression model. In this paper, we exploit this property to derive a formula for the estimator of the working regression parameters that yields the optimally efficient estimator of the marginal mean model (the parameters of interest) when the missing data model is correctly specified. The proposed method also inherits the DR property. Simulation studies demonstrate the greater efficiency of the proposed method compared with the standard DR method. A longitudinal dementia data set is used for illustration.
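To make the two ingredients of DR estimation concrete, the following minimal sketch compares an inverse weighted estimator with the standard DR (augmented) estimator of a marginal mean. It is not the optimally efficient estimator proposed in the paper: it assumes a simplified cross-sectional setting with one covariate, a known (correctly specified) missing data model `pi`, and a linear working regression fitted by least squares on the complete cases. All names (`ipw_mean`, `dr_mean`, `fit_working_regression`) and the simulated data-generating values are illustrative assumptions.

```python
import numpy as np


def ipw_mean(y, r, pi):
    """Inverse weighted estimate of E[Y]; r is the observation indicator."""
    y_obs = np.where(r == 1, y, 0.0)  # unobserved outcomes contribute zero
    return np.mean(r * y_obs / pi)


def dr_mean(y, r, pi, m):
    """Standard DR estimate of E[Y]: the inverse weighted term minus an
    augmentation term built from the working regression predictions m."""
    y_obs = np.where(r == 1, y, 0.0)
    return np.mean(r * y_obs / pi - (r - pi) / pi * m)


def fit_working_regression(x, y, r):
    """Illustrative working regression: least squares of Y on (1, X)
    using complete cases only; returns predictions for every subject."""
    obs = r == 1
    design = np.column_stack([np.ones(obs.sum()), x[obs]])
    coef, *_ = np.linalg.lstsq(design, y[obs], rcond=None)
    return coef[0] + coef[1] * x


# Simulated data (assumed for illustration): the true marginal mean is 1.0.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)

# Missing data model: probability that Y is observed, depending on X only.
pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = (rng.uniform(size=n) < pi).astype(int)
y = np.where(r == 1, y_full, np.nan)  # outcomes are missing where r == 0

m_hat = fit_working_regression(x, y, r)
est_ipw = ipw_mean(y, r, pi)
est_dr = dr_mean(y, r, pi, m_hat)
print(f"inverse weighted: {est_ipw:.3f}  doubly robust: {est_dr:.3f}")
```

Because the augmentation term `(r - pi) / pi * m` has mean zero for any choice of `m` when `pi` is correct, replacing `m_hat` with an arbitrary vector still gives a consistent estimate; only the variance changes. That freedom in choosing the working regression parameters is what the abstract describes using to improve efficiency.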