Beijing Key Laboratory of Traffic Engineering, Beijing University of Technology, Beijing 100124, China.
Department of Civil and Environmental Engineering, University of North Carolina at Charlotte, EPIC Building, Room 3366, 9201 University City Boulevard, Charlotte, NC 28223-0001, United States.
Accid Anal Prev. 2024 Nov;207:107740. doi: 10.1016/j.aap.2024.107740. Epub 2024 Aug 13.
The causes of traffic violations by elderly drivers differ from those of other age groups. To reduce the serious traffic violations that are more likely to cause serious crashes, this study divided traffic violations into three severity levels (slight, ordinary, severe) based on point deduction, and explored the patterns of serious traffic violations (ordinary and severe) using multi-source data. An interpretable machine learning framework was designed in which four popular machine learning models were enhanced and compared. Specifically, the adaptive synthetic sampling (ADASYN) method was applied to overcome the effects of imbalanced data and improve prediction accuracy for the minority classes (ordinary, severe); multi-objective feature selection based on NSGA-II was used to remove redundant factors, increasing computational efficiency and making the patterns discovered by the explainer more effective; and Bayesian hyperparameter optimization was used to obtain more effective hyperparameter combinations with fewer iterations and to boost model adaptability. Results show that the proposed framework can significantly improve, and help differentiate, the performance of the four machine learning models and two post-hoc interpretation methods. Six of the top ten most important factors belong to multi-scale built environment attributes. Comparing the feature contributions and interaction effects yields three findings: ordinary and severe traffic violations (1) share some influencing factors and interaction effects; (2) share some influencing factors, or combinations of influencing factors, whose values nevertheless differ between the two levels; and (3) each have some unique influencing factors and unique combinations of influencing factors.
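The adaptive synthetic sampling step mentioned above can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: it handles only one minority class against all other samples, and the function name `adasyn`, its parameters (`k`, `beta`, `seed`), and the toy coordinates and labels are assumptions made purely for illustration. The idea shown is that minority samples surrounded by more non-minority neighbours receive a larger share of the synthetic points, which are created by interpolating towards nearby minority samples.

```python
import math
import random


def adasyn(X, y, minority, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples in the ADASYN style (simplified sketch)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_rest = len(y) - n_min
    total = int((n_rest - n_min) * beta)  # how many samples to synthesise
    if total <= 0 or n_min < 2:
        return []

    def nearest(i, candidates):
        # Indices of the k candidates closest to X[i], excluding i itself.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # Difficulty ratio: share of non-minority points among the k nearest neighbours.
    ratios = []
    for i in min_idx:
        neigh = nearest(i, range(len(X)))
        ratios.append(sum(y[j] != minority for j in neigh) / len(neigh))

    norm = sum(ratios)
    if norm == 0:
        # No minority sample borders another class: fall back to equal shares.
        weights = [1.0 / n_min] * n_min
    else:
        weights = [r / norm for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        mates = nearest(i, min_idx)  # nearest neighbours within the minority class
        for _ in range(round(w * total)):
            j = rng.choice(mates)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (made up): six "slight" samples and three "severe" samples.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
     [1.0, 1.0], [1.2, 0.9], [0.5, 0.4]]
y = ["slight"] * 6 + ["severe"] * 3
new_points = adasyn(X, y, minority="severe", k=2)
```

In this toy set only the "severe" sample at (0.5, 0.4) has "slight" neighbours, so all synthetic points are interpolated from it towards the other two "severe" samples. A real study would more likely rely on an established library implementation and extend the procedure to each minority class.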