Weaver Colin G, Ravani Pietro, Oliver Matthew J, Austin Peter C, Quinn Robert R
Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada.
Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada; Department of Medicine, University of Calgary, Calgary, Alberta, Canada.
Nephrol Dial Transplant. 2015 Aug;30(8):1244-9. doi: 10.1093/ndt/gfv071. Epub 2015 Mar 25.
Poisson regression is commonly used to analyze hospitalization data when outcomes are expressed as counts (e.g. number of days in hospital). However, data often violate the assumptions on which Poisson regression is based. More appropriate extensions of this model, while available, are rarely used.
We compared hospitalization data between 206 patients treated with hemodialysis (HD) and 107 treated with peritoneal dialysis (PD) using standard Poisson regression and three other approaches for modeling count data: negative binomial (NB) regression, zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression. We examined the appropriateness of each model and compared the results obtained with each approach.
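For readers less familiar with these four models, their standard textbook forms can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $Y_i$ is the count for patient $i$, $t_i$ the follow-up time, $x_i$ a treatment indicator, $\alpha$ the NB dispersion parameter (NB2 parameterization assumed) and $\pi_i$ the probability of a structural zero.

```latex
% Poisson: the variance is constrained to equal the mean
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \log t_i + \beta_0 + \beta_1 x_i, \qquad
\mathrm{Var}(Y_i) = \mu_i

% Negative binomial (NB2): extra parameter \alpha \ge 0 allows overdispersion
\mathrm{Var}(Y_i) = \mu_i + \alpha \mu_i^2

% Zero-inflated Poisson: mixture of structural zeros and a Poisson count
P(Y_i = 0) = \pi_i + (1 - \pi_i)\, e^{-\mu_i}, \qquad
P(Y_i = k) = (1 - \pi_i)\, \frac{e^{-\mu_i} \mu_i^{k}}{k!}, \quad k \ge 1

% Rate ratio for the treatment indicator in the count part of each model
\mathrm{RR} = e^{\beta_1}
```

The ZINB model replaces the Poisson component of the mixture with an NB component, so it can accommodate both excess zeros and overdispersion among the non-zero counts.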
During a mean follow-up of 1.9 years, 183 of 313 patients (58%) were never hospitalized (indicating an excess of 'zeros'). The data also displayed overdispersion (variance greater than mean), violating another assumption of the Poisson model. Using four criteria, we determined that the NB and ZINB models performed best. According to these two models, patients treated with HD experienced hospitalization rates similar to those of patients receiving PD {NB rate ratio (RR): 1.04 [bootstrapped 95% confidence interval (CI): 0.49-2.20]; ZINB summary RR: 1.21 (bootstrapped 95% CI 0.60-2.46)}. Poisson and ZIP models fit the data poorly and had much larger point estimates than the NB and ZINB models [Poisson RR: 1.93 (bootstrapped 95% CI 0.88-4.23); ZIP summary RR: 1.84 (bootstrapped 95% CI 0.88-3.84)].
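The two Poisson violations described above (overdispersion and excess zeros) can be screened for with a few lines of code before any model is fitted. The following is a minimal sketch, not the authors' method: the function name `count_diagnostics` and the `hospital_days` values are hypothetical and chosen only for illustration, and the check ignores covariates and differing follow-up times.

```python
import math
from statistics import mean, variance


def count_diagnostics(counts):
    """Crude checks of two Poisson assumptions on raw counts.

    Compares the sample variance with the mean (overdispersion) and the
    observed share of zeros with exp(-mean), the share a Poisson
    distribution with the same mean would predict (excess zeros).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "dispersion_ratio": v / m,  # about 1 under a Poisson model
        "observed_zero": observed_zero,
        "poisson_expected_zero": math.exp(-m),
    }


# Hypothetical days-in-hospital counts for ten patients (NOT study data)
hospital_days = [0, 0, 0, 0, 0, 0, 1, 3, 10, 26]
diag = count_diagnostics(hospital_days)
for name, value in diag.items():
    print(f"{name}: {value:.3f}")
```

A dispersion ratio well above 1, or an observed zero proportion far above the Poisson expectation, suggests that NB or zero-inflated alternatives deserve consideration; a formal comparison would still rely on model-based criteria such as those used in the study.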
We found substantially different results when modeling hospitalization data, depending on the approach used. Our results argue strongly for a sound model selection process and improved reporting around statistical methods used for modeling count data.