Department of Civil and Environmental Engineering, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia.
College of Metropolitan Transportation, Beijing University of Technology, Beijing, China.
Int J Inj Contr Saf Promot. 2021 Dec;28(4):408-427. doi: 10.1080/17457300.2021.1928233. Epub 2021 Jun 1.
A better understanding of injury severity risk factors is fundamental to improving crash prediction and to the effective implementation of appropriate mitigation strategies. The traditional statistical models widely used for this purpose rely on predefined correlations and intrinsic assumptions which, if violated, may yield biased predictions. The present study investigates the use of the eXtreme Gradient Boosting (XGBoost) model, compared with a few traditional machine learning algorithms (logistic regression, random forest, and decision tree), for crash injury severity analysis. The data used in this study were obtained from the Traffic Safety Department of the Ministry of Transport (MOT) in Riyadh, KSA, and cover 13,546 motor vehicle collisions along 15 rural highways reported between January 2017 and December 2019. Empirical results obtained using k-fold cross-validation (k = 10) for various performance metrics showed that the XGBoost technique outperformed the other models in terms of collective predictive performance as well as the accuracy for each individual injury severity class. XGBoost feature importance analysis indicated that collision type, weather status, road surface conditions, on-site damage type, lighting conditions, and vehicle type are the few sensitive variables in predicting crash injury severity outcomes. Finally, a comparative analysis of XGBoost based on different performance statistics showed that our model outperformed the models reported in most previous studies.
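The 10-fold evaluation mentioned in the abstract can be illustrated with a minimal sketch of how such folds might be formed. This is not the authors' code: the abstract does not say whether the splits were shuffled or stratified, so the shuffling, the seed, and the names `k_fold_indices` and `cross_validate` are assumptions made purely for illustration. Only the record count (13,546) and k = 10 come from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation.

    Hypothetical helper; shuffling with a fixed seed is an assumption.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra record each.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


def cross_validate(fit_and_score, n_samples, k=10, seed=0):
    """Average a score over k folds.

    fit_and_score(train_idx, test_idx) is a caller-supplied function that
    trains any model on the training rows and returns a float score.
    """
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


N_CRASHES = 13_546  # record count reported in the abstract
fold_sizes = [len(test) for _, test in k_fold_indices(N_CRASHES)]
print(fold_sizes)
```

Each record appears in exactly one test fold, so every model under comparison (XGBoost, logistic regression, random forest, decision tree) would be scored on all 13,546 collisions once. Passing the same seed for every model keeps the comparison on identical splits.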