Shirazi Mohammadali, Lord Dominique, Dhavala Soma Sekhar, Geedipally Srinivas Reddy
Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.
Perceptron Learning Solutions Pvt Ltd, Bengaluru, India.
Accid Anal Prev. 2016 Jun;91:10-8. doi: 10.1016/j.aap.2016.02.020. Epub 2016 Mar 3.
Crash data are often characterized by over-dispersion, a heavy (long) tail, and many observations with the value zero. Over the last few years, a small number of researchers have started developing and applying multi-parameter models to analyze such data. These models have been proposed to overcome the limitations of the traditional negative binomial (NB) model, which cannot handle this kind of data efficiently. This paper continues that line of work. Its objective is to document the development and application of a flexible NB generalized linear model with randomly distributed mixed effects characterized by the Dirichlet process (NB-DP) to model crash data. The model was evaluated using two datasets and compared with the NB model and with the recently introduced model based on the mixture of the NB and Lindley (NB-L) distributions. Overall, the study shows that the NB-DP model performs better than the NB model when data are over-dispersed and have a heavy tail. The NB-DP performed better than the NB-L when the dataset had a heavy tail but a smaller percentage of zeros; however, both models performed similarly when the dataset contained a large number of zeros. In addition to greater flexibility, the NB-DP provides a clustering by-product that allows the safety analyst to better understand the characteristics of the data, such as the identification of outliers and sources of dispersion.
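To make the idea of an NB model with Dirichlet-process mixed effects concrete, the following is a minimal forward-simulation sketch, not the authors' estimation procedure. The abstract does not give the parameterization, so everything here is an assumption for illustration: a log link with a single covariate, a standard normal base measure, a truncated stick-breaking representation of the Dirichlet process, and the hypothetical names `stick_breaking_weights`, `simulate_nb_dp`, `beta0`, `beta1`, `phi`, and `alpha`.

```python
import numpy as np


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process (assumed truncation)."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    return weights / weights.sum()  # guard against floating-point drift


def simulate_nb_dp(x, beta0, beta1, phi, alpha, n_atoms=50, seed=0):
    """Simulate counts whose log-mean includes a DP-distributed random effect.

    Assumptions (not from the source): log link, N(0, 1) base measure,
    NB parameterized by mean mu and inverse dispersion phi.
    """
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = rng.normal(0.0, 1.0, size=n_atoms)  # random-effect values shared within a cluster
    cluster = rng.choice(n_atoms, size=x.shape[0], p=weights)  # cluster label per site
    mu = np.exp(beta0 + beta1 * x + atoms[cluster])
    # NumPy's NB(n, p) has mean n * (1 - p) / p, so p = phi / (phi + mu) gives mean mu
    y = rng.negative_binomial(phi, phi / (phi + mu))
    return y, cluster


x = np.linspace(0.0, 1.0, 200)
y, cluster = simulate_nb_dp(x, beta0=0.5, beta1=0.3, phi=2.0, alpha=1.0)
print("share of zeros:", np.mean(y == 0))
print("occupied clusters:", np.unique(cluster).size)
```

Observations that share a cluster label share a random-effect value, which is the kind of grouping the clustering by-product refers to. A smaller `alpha` concentrates the observations in fewer clusters.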