Leidos, Inc., 140 Sylvester Road, San Diego, CA, 92106, USA.
Deployment Health Research Department, Naval Health Research Center, 140 Sylvester Road, San Diego, CA, 92106, USA.
BMC Med Res Methodol. 2021 Jan 6;21(1):5. doi: 10.1186/s12874-020-01158-w.
Questionnaires used in longitudinal studies may have questions added or removed over time for numerous reasons. An item that is entirely absent from one follow-up survey is a missing-data problem specific to longitudinal studies. Although such excluded questions provide no information at that survey, they are collected at other follow-up surveys, and the covariances observed at those surveys may allow the missing data to be recovered. This study used data from a large longitudinal cohort study to assess the efficiency and feasibility of using multiple imputation (MI) to recover this type of information.
Millennium Cohort Study participants completed the 9-item Patient Health Questionnaire (PHQ) depression module at 2 time points (2004, 2007). The suicidal ideation item in the module was set to missing for the 2007 assessment. Several single-level MI models, which differed in their sets of predictors and in the form of the suicidal ideation variable, were then fitted, and the imputed values for this item in 2007 were compared with the self-reported values. Additionally, associations with two related constructs, sleep duration and smoking status, were compared between self-reported and imputed values of suicidal ideation.
Among 63,028 participants eligible for imputation analysis, 4.05% reported suicidal ideation on the 2007 survey. The imputation models identified suicidal ideation with a sensitivity between 34% and 66% and a positive predictive value between 36% and 42%. Specificity remained above 96% and negative predictive value above 97% for all imputation models. All imputation models yielded similar associations with the related constructs, though the dichotomous suicidal ideation variable imputed from the model using only PHQ depression items yielded estimates closest to the self-reported associations in all adjusted analyses.
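The four agreement measures reported above can be illustrated with a minimal sketch that treats the self-reported item as the reference and one set of imputed values as the prediction. The function name `agreement`, the `Agreement` container, and the toy data are hypothetical and not taken from the study; how the authors summarized these measures across multiple imputed datasets is not stated here, so the sketch handles a single imputed dataset only.

```python
from dataclasses import dataclass
from math import nan


@dataclass
class Agreement:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP)
    npv: float          # TN / (TN + FN)


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a margin of the 2x2 table is empty.
    return numerator / denominator if denominator else nan


def agreement(reported, imputed):
    """Compare dichotomous (0/1) self-reported and imputed values.

    `reported` is treated as the reference standard (assumption for this sketch).
    """
    if len(reported) != len(imputed):
        raise ValueError("reported and imputed must have the same length")
    tp = fp = tn = fn = 0
    for r, i in zip(reported, imputed):
        if r == 1 and i == 1:
            tp += 1
        elif r == 0 and i == 1:
            fp += 1
        elif r == 0 and i == 0:
            tn += 1
        else:
            fn += 1
    return Agreement(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Toy data, invented purely for illustration (not study data).
toy_reported = [1, 1, 0, 0, 0, 0, 1, 0]
toy_imputed = [1, 0, 0, 0, 1, 0, 1, 0]
toy_result = agreement(toy_reported, toy_imputed)
```

With a low-prevalence outcome such as the 4.05% reported here, specificity and negative predictive value are driven largely by the many true negatives, which is one reason they can stay high while sensitivity and positive predictive value remain modest.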
Although sensitivity and positive predictive value were relatively low, applying MI techniques allowed for inclusion of an otherwise missing variable. Additionally, correlations with related constructs were estimated near self-reported values. Therefore, MI based on the other 8 depression items can be used to estimate suicidal ideation when that item is completely missing from a survey. However, these imputed values should not be used to estimate population prevalence.