Song Xiuguang, Pi Rendong, Zhang Yu, Wu Jianqing, Dong Yuhuan, Zhang Han, Zhu Xinyuan
School of Qilu Transportation, Shandong University, Jinan 250061, China.
Suzhou Research Institute, Shandong University, Suzhou 215123, China.
Int J Environ Res Public Health. 2021 May 15;18(10):5271. doi: 10.3390/ijerph18105271.
Multi-vehicle (MV) crashes, which can cause great damage to society, have long been a serious traffic safety issue. A better understanding of crash severity can help transportation engineers identify the critical causes and find effective countermeasures to improve transportation safety. However, few studies have used machine learning methods to predict injury severity in MV crashes, and previous studies of MV crashes have rarely taken temporal stability into consideration. To bridge these knowledge gaps, two kinds of models, a random parameters logit (RPL) model with heterogeneity in the means and variances and a Random Forest (RF) model, were employed in this research to identify the critical contributing factors and to predict MV crash injury severity. Three years (2016-2018) of MV crash data from Washington State, United States, extracted from the Highway Safety Information System (HSIS), were used for the crash injury-severity analysis. In addition, a series of likelihood ratio tests was conducted to assess temporal stability between different years. Four indicators were employed to measure the prediction performance of the selected models, and four categories of crash-related characteristics were investigated in detail based on the RPL model. The results showed that the machine learning-based models outperformed the statistical models when overall accuracy was the evaluation indicator. However, the statistical models predicted better than the machine learning models when crash costs were considered. Temporal instability was found between the 2016 and 2017 MV data. The effects of the significant factors were elaborated based on the RPL model with heterogeneity in the means and variances.
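The abstract does not state which form of the likelihood ratio test was used for temporal stability. As a hedged illustration only, one form commonly used in the crash-severity literature for comparing two time periods is sketched below. The symbols t_1, t_2, \beta, and K are notation assumed here for the sketch and are not taken from the abstract.

```latex
% A sketch of a pairwise temporal-stability likelihood ratio test (assumed form).
% LL(\beta_{t_1}): log-likelihood at convergence of the model estimated on period t_1 data.
% LL(\beta_{t_2 t_1}): log-likelihood of the model whose parameters are fixed at the
%   values converged on period t_2 data, evaluated on period t_1 data.
\[
  \chi^2 = -2\left[\, LL\!\left(\beta_{t_2 t_1}\right) - LL\!\left(\beta_{t_1}\right) \right]
\]
% Under the null hypothesis that the parameters are equal across the two periods,
% the statistic is chi-square distributed with K degrees of freedom,
% where K is the number of estimated parameters.
```

Under this assumed form, a statistic exceeding the chi-square critical value for K degrees of freedom would reject the null hypothesis of equal parameters, which is the sense in which two years of data would be described as temporally unstable.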