Han Baorui, Huang Haibo, Li Gen, Jiang Chenming, Yang Zhen, Zhu Zhenjun
School of Automobile and Traffic Engineering, Nanjing Forestry University, Nanjing, Jiangsu, China.
China Institute of FTZ Supply Chain, Shanghai Maritime University, Shanghai, Shanghai, China.
PLoS One. 2025 Jan 3;20(1):e0314939. doi: 10.1371/journal.pone.0314939. eCollection 2025.
A classification prediction model based on a nonlinear method, the Gradient Boosting Decision Tree (GBDT), is established to investigate the factors contributing to a perpetrator's escape behavior in hit-and-run crashes. The model is trained on the U.S. Crash Report Sampling System (CRSS) dataset and compared with state-of-the-art methods: Classification and Regression Tree (CART), Random Forest, and Logistic Regression. The results show that GBDT outperforms the other methods, achieving the lowest negative log-likelihood (0.282) and misclassification rate (0.096) and the highest AUC (0.803). GBDT also demonstrates superior computational efficiency, with a LIFT value of 4.087, making it a more accurate and efficient model for predicting hit-and-run crashes than CART, Random Forest, and Logistic Regression. In the GBDT results, crash type and relation to trafficway rank fourth and fifth in relative importance, respectively. Neither factor is mentioned in previous studies, which indicates that GBDT can mine hidden information. In addition, interactions between influencing variables can be obtained to investigate the joint effects of multiple variables. The results of this study have practical applications in hit-and-run incident prevention, accident safety analysis, and other engineering applications.
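For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they might be computed from binary labels and predicted probabilities. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy data are hypothetical, and because the abstract does not state how LIFT is defined, the sketch assumes the common top-decile definition (hit-and-run rate among the top 10% of scored cases divided by the overall rate).

```python
import math


def negative_log_likelihood(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood (log loss) for binary labels."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def misclassification_rate(y_true, p_pred, threshold=0.5):
    """Share of cases whose thresholded prediction differs from the label.

    The 0.5 threshold is an assumption, not taken from the paper.
    """
    wrong = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)


def auc(y_true, p_pred):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift(y_true, p_pred, top_fraction=0.1):
    """Positive rate in the top-scored fraction divided by the base rate.

    Top-decile lift is an assumed definition; the abstract does not give one.
    """
    n_top = max(1, int(round(top_fraction * len(y_true))))
    ranked = sorted(zip(p_pred, y_true), key=lambda t: t[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Toy data (hypothetical): 1 = hit-and-run, 0 = driver stayed at the scene.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p_pred = [0.9, 0.2, 0.1, 0.7, 0.4, 0.3, 0.6, 0.35, 0.05, 0.15]

print("NLL:", negative_log_likelihood(y_true, p_pred))
print("Misclassification rate:", misclassification_rate(y_true, p_pred))
print("AUC:", auc(y_true, p_pred))
print("Lift (top 10%):", lift(y_true, p_pred))
```

Lower is better for the first two metrics and higher is better for AUC and lift, which matches the direction of the comparison reported for GBDT against CART, Random Forest, and Logistic Regression.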