Miaou S P
Center for Transportation Analysis, Oak Ridge National Laboratory, TN 37831.
Accid Anal Prev. 1994 Aug;26(4):471-82. doi: 10.1016/0001-4575(94)90038-8.
This paper evaluates the performance of Poisson and negative binomial (NB) regression models in establishing the relationship between truck accidents and the geometric design of road sections. Three types of models are considered: Poisson regression, zero-inflated Poisson (ZIP) regression, and NB regression. The maximum likelihood (ML) method is used to estimate the unknown parameters of these models. Two other feasible estimators of the dispersion parameter in the NB regression model are also examined: a moment estimator and a regression-based estimator. The models and estimators are evaluated on (i) estimated regression parameters, (ii) overall goodness-of-fit, (iii) estimated relative frequency of truck accident involvements across road sections, (iv) sensitivity to the inclusion of short road sections, and (v) estimated total number of truck accident involvements. Data from the Highway Safety Information System are used for the evaluation. The results suggest that the NB regression model estimated using the moment and regression-based methods should be used with caution. Under the ML method, the estimated regression parameters from all three models are quite consistent, and no model outperforms the other two in terms of the estimated relative frequencies of truck accident involvements across road sections. It is recommended that the Poisson regression model be used as an initial model for developing the relationship. If the overdispersion of the accident data is found to be moderate or high, both the NB and ZIP regression models could be explored. Overall, the ZIP regression model appears to be a serious candidate when the data exhibit excess zeros, e.g., due to underreporting. However, the ZIP model can be difficult to interpret.
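The three count distributions compared in the abstract, and the idea of a moment estimator for the NB dispersion parameter, can be illustrated with a minimal Python sketch. This is not the paper's code. The function names, the NB parameterization (variance = mu + alpha * mu^2, often called NB2), and the particular form of the moment estimator are assumptions chosen for illustration; the abstract does not give the exact formulas used.

```python
import math


def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson count with mean mu > 0."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def nb_pmf(y, mu, alpha):
    """P(Y = y) for a negative binomial count with mean mu and dispersion alpha > 0.

    Assumed parameterization (NB2): Var(Y) = mu + alpha * mu**2.
    """
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zip_pmf(y, mu, pi):
    """P(Y = y) for a zero-inflated Poisson.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean mu.
    """
    base = (1.0 - pi) * poisson_pmf(y, mu)
    return pi + base if y == 0 else base


def moment_dispersion(counts, fitted_means):
    """One simple moment-type estimate of the NB2 dispersion alpha.

    Solves sum((y - mu)^2) = sum(mu + alpha * mu^2) for alpha and floors
    the result at zero. This is an illustrative assumption, not
    necessarily the estimator examined in the paper.
    """
    num = sum((y - m) ** 2 - m for y, m in zip(counts, fitted_means))
    den = sum(m ** 2 for m in fitted_means)
    return max(num / den, 0.0)
```

For a given mean, the ZIP model puts more probability on zero than the Poisson model does, which is why it is a candidate when the data contain excess zeros. An estimate of `alpha` well above zero from `moment_dispersion` indicates overdispersion relative to the Poisson model.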