Wang Joanna J J, Bartlett Mark, Ryan Louise
School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, Australia.
The Sax Institute, Sydney, Australia.
BMC Med Res Methodol. 2017 May 8;17(1):80. doi: 10.1186/s12874-017-0355-z.
In longitudinal studies, nonresponse to follow-up surveys poses a major threat to validity, interpretability and generalisation of results. The problem of nonresponse is further complicated by the possibility that nonresponse may depend on the outcome of interest. We identified sociodemographic, general health and wellbeing characteristics associated with nonresponse to the follow-up questionnaire and assessed the extent and effect of nonresponse on statistical inference in a large-scale population cohort study.
We obtained data from the baseline and first wave of the follow-up survey of the 45 and Up Study. Of those who were invited to participate in the follow-up survey, 65.2% responded. A logistic regression model was used to identify baseline characteristics associated with follow-up response. A Bayesian selection model approach with sensitivity analysis was implemented to model nonignorable nonresponse.
Characteristics associated with a higher likelihood of responding to the follow-up survey included being female, being aged 55-74, holding a high educational qualification, being married or in a de facto relationship, working part time or being partially or fully retired, and having a higher household income. Parameter estimates and conclusions were generally consistent across different assumptions about the missing data mechanism. However, we observed some sensitivity for variables that are strong predictors of both the outcome and nonresponse.
Results indicated that, in the context of the binary outcome under study, nonresponse did not result in substantial bias and did not alter the interpretation of results in general. Conclusions were still largely robust under a nonignorable missing data mechanism. Use of a Bayesian selection model is recommended as a useful strategy for assessing the potential sensitivity of results to missing data.
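To make the idea of a selection-model sensitivity analysis concrete, the following is a minimal sketch and not the authors' Bayesian model. It assumes a single binary outcome with no covariates and a response model logit P(respond | y) = a + delta * y, where delta is a hypothetical sensitivity parameter: delta = 0 means nonresponse does not depend on the outcome, and delta != 0 means it is nonignorable. Given the response rate (65.2%, from the abstract) and an observed outcome prevalence among responders (0.30 here is an illustrative value, not a study result), it solves for the population prevalence implied by each delta. The function name `implied_prevalence` and all parameter names are illustrative.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def implied_prevalence(p_obs, resp_rate, delta, lo=-30.0, hi=30.0, tol=1e-12):
    """Population prevalence implied by a simple selection model.

    Assumed model (illustrative, no covariates):
        P(y = 1) = pi
        P(respond | y) = expit(a + delta * y)
    Observed quantities:
        resp_rate = overall probability of responding
        p_obs     = P(y = 1 | responded)
    For a fixed delta, the intercept a must satisfy
        p_obs * resp_rate / expit(a + delta)
        + (1 - p_obs) * resp_rate / expit(a) = 1,
    which is solved here by bisection (the left side decreases in a).
    """
    def g(a):
        return (p_obs * resp_rate / expit(a + delta)
                + (1.0 - p_obs) * resp_rate / expit(a) - 1.0)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    a = 0.5 * (lo + hi)
    return p_obs * resp_rate / expit(a + delta)


# Sweep the sensitivity parameter; 0.30 is a made-up observed prevalence.
for delta in (-1.0, -0.5, 0.0, 0.5, 1.0):
    pi = implied_prevalence(p_obs=0.30, resp_rate=0.652, delta=delta)
    print(f"delta={delta:+.1f}  implied prevalence={pi:.4f}")
```

If the implied prevalence barely moves across a plausible range of delta, conclusions can be called robust to nonignorable nonresponse in this simplified setting. A full analysis along the lines described in the abstract would add covariates to both the outcome and response models and place priors on the parameters.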