Xu Tao, Zhu Guangjin, Han Shaomei
Department of Epidemiology and Statistics, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China.
Department of Physiopathology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China.
BMJ Open. 2017 Nov 28;7(11):e016471. doi: 10.1136/bmjopen-2017-016471.
In studies of depression, the number of depression symptoms can be treated as count data to obtain more complete and accurate analytical findings. This study compares the goodness of fit of four count outcome models in a large survey sample to identify the optimum model for a risk factor study of the number of depression symptoms.
A total of 15 820 subjects aged 10 to 80 years, who were not suffering from serious chronic diseases and had not had a high fever in the past 15 days, agreed to take part in the survey; 15 462 of them completed all the survey scales. The number of depression symptoms was the sum of the 'positive' responses to seven depression questions. Four count outcome models and a logistic model were constructed to identify the optimum model for the number of depression symptoms.
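The outcome construction described above can be sketched as follows. This is only an illustration: the function name `symptom_count`, the respondent data and the binary `has_symptoms` coding for the logistic model are assumptions, not details taken from the study.

```python
from statistics import mean, pvariance


def symptom_count(responses):
    """Count the 'positive' answers among the seven depression questions."""
    if len(responses) != 7:
        raise ValueError("expected exactly seven responses")
    return sum(1 for answer in responses if answer)


# Made-up respondents for illustration only (not survey data).
respondents = [
    [False] * 7,
    [True, False, False, False, False, False, False],
    [True, True, False, True, False, False, False],
    [False] * 7,
    [True, True, True, True, True, False, False],
]

# Count outcome used by the count models.
counts = [symptom_count(r) for r in respondents]

# Assumed binary coding (any symptom vs none) for a logistic model.
has_symptoms = [int(c > 0) for c in counts]

# A variance well above the mean is a rough first hint of over-dispersion;
# it does not replace a formal test on a fitted model.
count_mean = mean(counts)
count_variance = pvariance(counts)
```

Comparing the raw variance with the mean is only a descriptive check. The over-dispersion statistic in a study of this kind comes from fitted regression models, not from this marginal comparison.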
The mean number of depression symptoms was 1.37±1.55. The over-dispersion test statistic was 308.011. The alpha dispersion parameter was 0.475 (95% CI 0.443 to 0.508), which was significantly larger than 0. The Vuong test statistic was 6.782 (P<0.001), indicating that there were too many zero counts to be accounted for by the traditional negative binomial distribution. The zero-inflated negative binomial (ZINB) model had the largest log likelihood and the smallest AIC and BIC, suggesting the best goodness of fit. In addition, for many counts the predicted probabilities from the ZINB model were the closest to the observed counts.
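To make the quantities above concrete, the following minimal sketch writes out a negative binomial probability (in the common NB2 form, where the variance is mu + alpha·mu²), its zero-inflated version, and the AIC and BIC formulas. The parameterisation, the function names and the example values of `mu` and `pi` are assumptions for illustration; only alpha = 0.475 is taken from the reported results, and the code does not reproduce the study's fitted models.

```python
import math


def nb_pmf(k, mu, alpha):
    """NB2 probability of count k with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero."""
    base = (1.0 - pi) * nb_pmf(k, mu, alpha)
    return pi + base if k == 0 else base


def aic(loglik, n_params):
    """Akaike information criterion (smaller is better)."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * loglik


# alpha is the reported dispersion estimate; mu and pi are hypothetical.
alpha, mu, pi = 0.475, 1.8, 0.2
p_zero_nb = nb_pmf(0, mu, alpha)
p_zero_zinb = zinb_pmf(0, mu, alpha, pi)
```

As alpha approaches 0 the negative binomial approaches the Poisson distribution, which is why an alpha significantly larger than 0 points to over-dispersion. With pi > 0, the zero-inflated model always assigns more probability to zero than the plain negative binomial with the same mu and alpha, which is the feature the Vuong test result favours here.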
All fit statistics and the predicted probability curve led to the same conclusion: the ZINB model was the best model for fitting the number of depression symptoms, assessing both the presence or absence of depression and its severity.