Zachry Department of Civil and Environmental Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
Department of Civil and Environmental Engineering, University of Maine, Orono, ME, 04469, United States.
Accid Anal Prev. 2024 Nov;207:107711. doi: 10.1016/j.aap.2024.107711. Epub 2024 Jul 30.
Crash counts are non-negative integers and are often analyzed with crash frequency models based on the negative binomial (NB) distribution. However, because crashes are random and infrequent, crash data usually exhibit distinctive characteristics, such as excess zero observations, that the NB distribution cannot adequately model. The negative binomial-Lindley (NBL) and random parameters negative binomial-Lindley (RPNBL) models have been proposed to address this limitation. Although these models handle excess zero observations, they may not fully account for unobserved heterogeneity arising from temporal variations in crash data. In addition, many variables, such as traffic volume, speed, and weather, change over time, so the analyst often needs temporally disaggregated data to capture their variation. For example, monthly crash datasets are recommended over yearly ones to better account for time-varying weather variables. Using temporally disaggregated data, however, both introduces temporal variation that the model must handle and compounds the problem of excess zero observations. To address these issues, this paper introduces a new variant of the NBL model with coefficients and Lindley parameters that vary by time. The derivation and characteristics of the model are discussed, and the model is then illustrated using a simulation study. Subsequently, the model is applied to two empirical crash datasets collected on rural principal and minor arterial roads in Texas. These datasets include several time-dependent variables, such as monthly traffic volume, standard deviation of speed, and precipitation, and exhibit excess zero observations. Several goodness-of-fit (GOF) measures indicate that the NBL model with time-dependent parameters fits better than the NB model, the NBL model, and the NB model with time-dependent parameters.
Findings from crash data collected on both rural minor and principal arterial roads in Texas suggest that the presence of a median and wider shoulders are associated with a potential decrease in crash occurrences. Conversely, greater variation in speed and wider road surfaces are linked to a potential increase in crash frequency. Similarly, monthly average daily traffic (Monthly ADT) is positively correlated with crash frequency. We also found that it is important to account for temporal variations using time-dependent parameters.
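To make the idea of an NBL model with time-varying parameters more concrete, the following is a minimal simulation sketch and not the paper's specification. It assumes the commonly used hierarchical form of the NBL, in which a count is NB-distributed with mean `eps * mu` and `eps` follows a Lindley distribution, and it assumes a log-linear mean with one hypothetical covariate (log monthly ADT). All function names, parameter values, and the covariate distribution are illustrative assumptions; the abstract gives none of them.

```python
import math
import random


def sample_lindley(theta, rng):
    """Draw from Lindley(theta) using its two-component gamma mixture."""
    # With probability theta / (1 + theta) the draw is Gamma(shape=1, rate=theta),
    # otherwise it is Gamma(shape=2, rate=theta).
    shape = 1 if rng.random() < theta / (1.0 + theta) else 2
    return rng.gammavariate(shape, 1.0 / theta)  # second argument is the scale


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_nbl(mu, phi, theta, rng):
    """One NB-Lindley count: NB with mean eps * mu and dispersion phi (assumed form)."""
    eps = sample_lindley(theta, rng)
    # Gamma-Poisson representation of the NB: gamma rate with mean eps * mu.
    lam = rng.gammavariate(phi, eps * mu / phi)
    return sample_poisson(lam, rng)


def simulate_monthly(n_sites, beta0_by_month, beta_adt_by_month,
                     theta_by_month, phi, seed=0):
    """Simulate site-by-month counts with month-specific coefficients and theta."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        log_adt = rng.gauss(8.0, 0.5)  # hypothetical covariate, not from the paper
        row = []
        for b0, b1, theta in zip(beta0_by_month, beta_adt_by_month, theta_by_month):
            mu = math.exp(b0 + b1 * log_adt)  # assumed log-linear mean
            row.append(sample_nbl(mu, phi, theta, rng))
        counts.append(row)
    return counts


if __name__ == "__main__":
    months = 12
    # Illustrative values only: intercept and Lindley parameter drift by month.
    beta0 = [-9.0 + 0.05 * m for m in range(months)]
    beta_adt = [1.0] * months
    theta = [1.5 + 0.1 * m for m in range(months)]
    data = simulate_monthly(200, beta0, beta_adt, theta, phi=2.0, seed=42)
    flat = [y for row in data for y in row]
    print("share of zero counts:", sum(y == 0 for y in flat) / len(flat))
```

Letting `theta_by_month` differ across months is what distinguishes this sketch from a plain NBL simulation; setting all months to the same values recovers a time-invariant version for comparison.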