Wu P, Tu X M, Kowalski J
Department of Biostatistics and Computational Biology, Rochester, NY, 14623, U.S.A.
Stat Med. 2014 Jan 15;33(1):143-57. doi: 10.1002/sim.5908. Epub 2013 Jul 30.
The generalized estimating equation (GEE), a distribution-free, or semi-parametric, approach for modeling longitudinal data, is used in a wide range of behavioral, psychotherapy, pharmaceutical drug safety, and healthcare-related research studies. Most popular methods for assessing model fit are based on the likelihood function for parametric models, rendering them inappropriate for distribution-free GEE. One rare exception is a score statistic initially proposed by Tsiatis for logistic regression (1980) and later extended by Barnhart and Williamson to GEE (1998). Because GEE provides valid inference only under the missing completely at random assumption, and missing values arising in most longitudinal studies do not follow such a restrictive mechanism, this GEE-based score test has very limited application in practice. We propose extensions of this goodness-of-fit test to address missing data under the missing at random assumption, a more realistic mechanism that applies to most studies in practice. We examine the performance of the proposed tests using simulated data and demonstrate their utility with data from a real study on geriatric depression and associated medical comorbidities.