Mahmud Asif, Gayah Vikash V, Paleti Rajesh
Department of Civil and Environmental Engineering, The Pennsylvania State University, 231 Sackett Building, University Park, PA 16802, United States.
Accid Anal Prev. 2023 May;184:106998. doi: 10.1016/j.aap.2023.106998. Epub 2023 Feb 11.
Crash misclassification (MC), e.g., a crash of one type or severity being mistakenly categorized as another, is a relatively common problem in transportation safety. Crash frequency models for individual crash categories estimated using datasets with MC errors could produce biased parameter estimates and thus lead to ineffective countermeasure planning. This study proposes a novel methodological formulation that directly accounts for this MC error and incorporates it into the two most common count data models used for crash frequency prediction: Poisson and Negative Binomial (NB) regression. The proposed framework introduces probabilistic MC rates among different crash types and modifies the likelihood function of the count models accordingly. The paper also demonstrates how this approach can be integrated into reformulated models that express each count model as a discrete choice model. The ability of the proposed models to recover true parameters in the presence of MC error is examined via simulation analysis. The proposed models are then applied to empirical data to test for the presence of MC in crash data and to further examine the robustness of the proposed models. Although the MC rates are found to be very low in the empirical data, the proposed models fit the data better than models that ignore MC error and thus likely provide more reliable parameter estimates.
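To illustrate the idea of modifying a count-model likelihood with probabilistic MC rates, the following is a minimal sketch for the Poisson case only. It is not the paper's formulation, which the abstract does not spell out. It assumes that true counts of each crash type are independent Poisson variables and that each crash is misclassified independently, in which case the recorded counts are again independent Poisson with mixed means. The names `observed_means`, `mc_poisson_loglik`, and `mc_rates` are hypothetical, as are the numeric values.

```python
import math


def poisson_logpmf(y, mu):
    """Log of the Poisson probability mass function at count y with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def observed_means(true_means, mc_rates):
    """Mean of each recorded crash type.

    mc_rates[k][j] is the assumed probability that a crash whose true type
    is k gets recorded as type j; each row should sum to 1.
    """
    n = len(true_means)
    return [
        sum(true_means[k] * mc_rates[k][j] for k in range(n))
        for j in range(n)
    ]


def mc_poisson_loglik(counts, true_means, mc_rates):
    """Log-likelihood of recorded counts under the assumed MC mechanism."""
    mus = observed_means(true_means, mc_rates)
    return sum(poisson_logpmf(y, mu) for y, mu in zip(counts, mus))


# Illustrative (made-up) values: two crash types at one site.
true_means = [4.0, 1.0]      # expected true crashes of type 0 and type 1
mc_rates = [[0.9, 0.1],      # 10% of type-0 crashes recorded as type 1
            [0.2, 0.8]]      # 20% of type-1 crashes recorded as type 0
identity = [[1.0, 0.0],
            [0.0, 1.0]]      # no misclassification

recorded_means = observed_means(true_means, mc_rates)  # about [3.8, 1.2]
counts = [5, 1]              # recorded crash counts (hypothetical)
ll_with_mc = mc_poisson_loglik(counts, true_means, mc_rates)
ll_no_mc = mc_poisson_loglik(counts, true_means, identity)
```

In a regression setting, `true_means` would come from a link such as `exp(x @ beta)` for each crash type, and the MC rates would be estimated jointly with the coefficients by maximizing this likelihood. With an identity `mc_rates` matrix the expression reduces to the ordinary Poisson log-likelihood.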