Gao Sujuan
Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis 46202-2872, USA.
Stat Med. 2004 Jan 30;23(2):211-9. doi: 10.1002/sim.1710.
Death is a significant source of missing data in longitudinal epidemiologic studies of elderly individuals. Missing data due to death are generally believed to be non-ignorable for likelihood-based inference, so inference based only on data from surviving participants may be biased. In this paper we model both the probability of disease and the probability of death using shared random effect parameters. We also propose using the Laplace approximation to obtain an approximate likelihood function, so that high-dimensional integration over the distributions of the random effect parameters is not necessary. Parameter estimates are obtained by maximizing the approximate log-likelihood function. Data from a longitudinal dementia study are used to illustrate the approach. A small simulation compares parameter estimates from the proposed method with those from the 'naive' method, in which the missing data are treated as missing at random.
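The Laplace approximation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's exact form, so everything below is an assumption made for illustration: one subject, a single scalar random intercept b ~ N(0, sigma^2) shared by a logistic model for disease at each visit and a logistic model for death, with a hypothetical scaling parameter gamma on the death side. The function names and the numeric values are likewise hypothetical. The sketch finds the mode of the integrand by Newton's method, applies the second-order (Laplace) formula, and compares the result with brute-force trapezoid integration.

```python
import math

def expit(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def log_bernoulli(y, eta):
    # log P(Y = y) for a Bernoulli outcome with logit eta:
    # y * eta - log(1 + exp(eta)), computed stably.
    return y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))

def log_joint(b, y, eta_disease, died, eta_death, gamma, sigma):
    # Log of the integrand: disease outcomes, death indicator, and the
    # N(0, sigma^2) density of the shared random effect b.
    ll = sum(log_bernoulli(yj, ej + b) for yj, ej in zip(y, eta_disease))
    ll += log_bernoulli(died, eta_death + gamma * b)
    ll += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * b ** 2 / sigma ** 2
    return ll

def neg_hessian(b, eta_disease, eta_death, gamma, sigma):
    # Minus the second derivative of log_joint in b (always positive here).
    p = [expit(ej + b) for ej in eta_disease]
    q = expit(eta_death + gamma * b)
    return (sum(pj * (1.0 - pj) for pj in p)
            + gamma ** 2 * q * (1.0 - q) + 1.0 / sigma ** 2)

def laplace_log_marginal(y, eta_disease, died, eta_death, gamma, sigma,
                         max_iter=100, tol=1e-10):
    # Newton iterations for the mode b_hat of log_joint.
    b = 0.0
    for _ in range(max_iter):
        p = [expit(ej + b) for ej in eta_disease]
        q = expit(eta_death + gamma * b)
        grad = (sum(yj - pj for yj, pj in zip(y, p))
                + gamma * (died - q) - b / sigma ** 2)
        step = grad / neg_hessian(b, eta_disease, eta_death, gamma, sigma)
        b += step
        if abs(step) < tol:
            break
    h = neg_hessian(b, eta_disease, eta_death, gamma, sigma)
    # Laplace formula: log-integrand at the mode plus a curvature correction.
    return (log_joint(b, y, eta_disease, died, eta_death, gamma, sigma)
            + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(h))

def quadrature_log_marginal(y, eta_disease, died, eta_death, gamma, sigma,
                            n=4001):
    # Brute-force trapezoid rule on [-10*sigma, 10*sigma] for comparison.
    lo, hi = -10.0 * sigma, 10.0 * sigma
    width = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * width
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(
            log_joint(b, y, eta_disease, died, eta_death, gamma, sigma))
    return math.log(total * width)

# Hypothetical subject: three visits, disease at the last one, then death.
example = dict(y=[0, 0, 1], eta_disease=[-2.0, -1.5, -1.0],
               died=1, eta_death=-1.0, gamma=0.8, sigma=1.0)
print("Laplace:   ", laplace_log_marginal(**example))
print("Quadrature:", quadrature_log_marginal(**example))
```

In one dimension the quadrature is cheap, so the comparison only serves as a check. The approximation becomes useful when the random effects are high-dimensional and direct integration is impractical, which is the setting the abstract describes. Summing such per-subject terms over all subjects would give an approximate log-likelihood that could be passed to a numerical optimizer.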