Jiangsu Key Laboratory of Traffic and Transportation Security, Huaiyin Institute of Technology, Huaian 223003, China.
Key Laboratory of Road and Traffic Engineering of the State Ministry of Education, College of Transportation Engineering, Tongji University, Shanghai 201804, China.
Int J Environ Res Public Health. 2021 Nov 3;18(21):11564. doi: 10.3390/ijerph182111564.
In many related works, nominal classification algorithms ignore the ordering of injury severity levels and therefore make sub-optimal predictions, while existing ordinal classification methods suffer from rank inconsistency and rank non-monotonicity. This paper proposes an ordinal classification approach for predicting traffic crash injury severity and compares its performance with that of existing machine learning classification methods. First, we compare the performance of neural network, XGBoost, and SVM classifiers in injury severity prediction. Second, we use a severity category-combination method with oversampling to relieve the class-imbalance problem prevalent in crash data. Third, we apply probability calibration and optimal probability threshold moving to improve the predictive ability of ordinal classification. The proposed approach satisfies the rank consistency and rank monotonicity requirements, and statistical significance testing shows that it outperforms other ordinal classification methods and nominal machine learning classifiers. Important factors relating to injury severity are selected based on their permutation feature importance scores. We find that converting severity levels into three classes (minor injury, moderate injury, and serious injury) can substantially improve prediction precision.
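To make the terms rank monotonicity, rank consistency, and threshold moving concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation: it assumes a common decomposition of a three-class ordinal problem into two binary tasks that estimate P(severity > k), and the function names, probabilities, and thresholds are all hypothetical.

```python
from itertools import accumulate

# Three combined severity classes named in the abstract.
LABELS = ["minor injury", "moderate injury", "serious injury"]


def monotone_exceedance(probs):
    """Make P(y > k) non-increasing in k by taking a running minimum.

    Assumption: probs[k] is a calibrated estimate of P(y > k) produced by
    an independent binary classifier, so the raw values may be non-monotone.
    """
    return list(accumulate(probs, min))


def predict_rank(probs, thresholds):
    """Return an ordinal label index from K-1 exceedance probabilities.

    thresholds[k] is the (possibly moved) cut-off for binary task k.
    Stopping at the first failed task means a higher task can never be
    passed after a lower one failed, which keeps the label rank-consistent.
    """
    rank = 0
    for p, t in zip(monotone_exceedance(probs), thresholds):
        if p < t:
            break
        rank += 1
    return rank


# Hypothetical outputs for one crash record: the second probability
# exceeds the first, which violates rank monotonicity.
probs = [0.40, 0.55]

# Default 0.5 cut-offs versus hypothetical moved thresholds.
default_rank = predict_rank(probs, [0.5, 0.5])
moved_rank = predict_rank(probs, [0.30, 0.45])

print(LABELS[default_rank], "->", LABELS[moved_rank])
```

In this toy case the running minimum turns the probabilities into 0.40 and 0.40, the default cut-offs yield the lowest class, and the moved thresholds raise the prediction by one level. How thresholds are actually optimized and how probabilities are calibrated in the paper is not specified in the abstract, so those steps are left out here.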