Wang Joanna J J, Bartlett Mark, Ryan Louise
School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, NSW, Australia.
The Sax Institute, Sydney, NSW, Australia.
Stat Med. 2017 Aug 30;36(19):3005-3021. doi: 10.1002/sim.7349. Epub 2017 Jun 2.
Nonresponse and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimates, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness, where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in the regression estimates and show that the observed likelihood is non-identifiable under a non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis that studies how the estimated parameters change, and how robust the study conclusions are, under different assumptions. A Bayesian framework is used for model estimation because it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to the follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality of life. Copyright © 2017 John Wiley & Sons, Ltd.
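The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' Bayesian selection model; it only shows what a complete-case logistic regression does when nonresponse depends on the outcome alone. It assumes an outcome model logit P(Y=1|x) = β0 + β1·x and a response model logit P(R=1|y) = ψ0 + ψ1·y; the parameter values, the single standard-normal covariate, and the function names `fit_logistic` and `simulate` are all hypothetical choices made for illustration. Under these assumptions the complete-case intercept is expected to shift by log(P(R=1|Y=1) / P(R=1|Y=0)) while the slope is left approximately unchanged, as in case-control sampling.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        w = p * (1.0 - p)                     # variance weights
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n=200_000, beta=(-1.0, 0.5), psi=(1.0, -1.5), seed=0):
    """Compare full-data and complete-case fits (illustrative values only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    y = rng.binomial(1, expit(X @ np.asarray(beta)))

    # Non-ignorable mechanism: the response probability depends on y itself.
    r = rng.binomial(1, expit(psi[0] + psi[1] * y)).astype(bool)

    full = fit_logistic(X, y)                 # fit as if nothing were missing
    cc = fit_logistic(X[r], y[r])             # complete-case fit (responders only)

    # Theoretical intercept shift: log of the ratio of response probabilities.
    offset = np.log(expit(psi[0] + psi[1]) / expit(psi[0]))
    return full, cc, offset


if __name__ == "__main__":
    full, cc, offset = simulate()
    print("full data     :", full)
    print("complete case :", cc)
    print("expected intercept shift:", offset)
```

Because ψ1 is not identifiable from the observed data alone, a sensitivity analysis of the kind the abstract describes would fix or place a prior on ψ1 across a range of plausible values and examine how the conclusions change; this sketch fixes it at a single assumed value.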