Department of Psychology, University of Illinois at Chicago, Chicago, IL, USA.
Nicotine Tob Res. 2010 Apr;12(4):445-8. doi: 10.1093/ntr/ntp213. Epub 2010 Jan 25.
If not handled appropriately, missing data can result in biased estimates and, quite possibly, incorrect conclusions about treatment efficacy. This article aimed to demonstrate how ordinary use of generalized estimating equations (GEE) can be problematic if the assumption of missing completely at random (MCAR) is not met.
We tested whether results differed across analytic methods when the MCAR assumption was violated. This example used data from a published randomized controlled trial examining whether varying the timing of a weight management intervention, delivered in concert with smoking cessation treatment, improved cessation rates for adult female smokers. Participants were 284 women with at least one report of smoking status during Visits 4-16. Smoking status was assessed at each visit via self-report and biologically verified using expired carbon monoxide.
The GEE analysis found differences in smoking status between conditions, but tests of the MCAR assumption indicated that the assumption did not hold for this dataset. Additional analyses using methods that do not require the MCAR assumption found no differences between conditions. Thus, ordinary GEE is not an appropriate choice for this analysis.
GEE is an appropriate technique for analyzing dichotomous data when the MCAR assumption holds, but weighted GEE or mixed-effects logistic regression is more appropriate when the missing data mechanism is not MCAR.
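The following is a minimal simulation sketch of why the missing data mechanism matters. It is not the trial's data or the authors' analysis. It assumes a hypothetical two-visit design in which women who report smoking at the first visit are more likely to miss the second visit, so missingness depends on an observed outcome and is not MCAR. All probabilities and function names (`simulate_trial`, `summarize`) are illustrative assumptions. The available-case estimate stands in for what an unweighted marginal analysis would target, and the inverse-probability-weighted estimate illustrates the idea behind weighted GEE in its simplest form.

```python
import random


def simulate_trial(n=20000, seed=1):
    """Simulate two visits; return a list of (y1, y2, observed) tuples.

    y1, y2: smoking status (1 = smoking) at visits 1 and 2.
    observed: whether visit 2 was attended. All rates are hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y1 = 1 if rng.random() < 0.5 else 0
        # Smoking status is correlated across visits.
        y2 = 1 if rng.random() < (0.8 if y1 else 0.2) else 0
        # Visit-1 smokers are more likely to miss visit 2 (not MCAR).
        observed = rng.random() < (0.4 if y1 else 0.9)
        rows.append((y1, y2, observed))
    return rows


def summarize(rows):
    """Return (full-data rate, available-case rate, weighted rate, p_obs)."""
    truth = sum(y2 for _, y2, _ in rows) / len(rows)

    seen = [(y1, y2) for y1, y2, obs in rows if obs]
    naive = sum(y2 for _, y2 in seen) / len(seen)

    # Estimate P(visit 2 observed | visit-1 status) from the data.
    p_obs = {}
    for level in (0, 1):
        total = sum(1 for y1, _, _ in rows if y1 == level)
        kept = sum(1 for y1, _ in seen if y1 == level)
        p_obs[level] = kept / total

    # Weight each observed woman by the inverse of her observation probability.
    weights = [1.0 / p_obs[y1] for y1, _ in seen]
    ipw = sum(w * y2 for w, (_, y2) in zip(weights, seen)) / sum(weights)
    return truth, naive, ipw, p_obs


if __name__ == "__main__":
    truth, naive, ipw, p_obs = summarize(simulate_trial())
    print(f"full-data smoking rate at visit 2: {truth:.3f}")
    print(f"available-case estimate:           {naive:.3f}")
    print(f"inverse-probability weighted:      {ipw:.3f}")
```

Under these assumed rates, the available-case estimate understates smoking at the second visit because smokers drop out more often, while the weighted estimate lands close to the full-data rate. In a real analysis the observation probabilities would come from a fitted dropout model rather than from two strata.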