Heterogeneous ensemble learning for enhanced crash forecasts - A frequentist and machine learning based stacking framework.

Author information

Ahmad Numan, Wali Behram, Khattak Asad J

Affiliations

Department of Civil & Environmental Engineering, The University of Tennessee, Knoxville, TN 37996, USA.

Urban Design 4 Health, Inc., 24 Jackie Circle, East Rochester, NY 14612, USA.

Publication information

J Safety Res. 2023 Feb;84:418-434. doi: 10.1016/j.jsr.2022.12.005. Epub 2022 Dec 14.

DOI:10.1016/j.jsr.2022.12.005

PMID:36868672

Abstract

INTRODUCTION

This study aims to improve the accuracy of crash frequency predictions on roadway segments, so that future safety on roadway facilities can be forecast more reliably. A variety of statistical and machine learning (ML) methods are used to model crash frequency, with ML methods generally achieving higher prediction accuracy. Recently, heterogeneous ensemble methods (HEM), including "stacking," have emerged as more accurate and robust techniques that provide more reliable predictions.

METHODS

This study applies "Stacking" to model crash frequency on five-lane undivided (5T) segments of urban and suburban arterials. The prediction performance of "Stacking" is compared with parametric statistical models (Poisson and negative binomial) and three state-of-the-art ML techniques (decision tree, random forest, and gradient boosting), each of which is termed a base-learner. Combining the individual base-learners through stacking with an optimal weighting scheme avoids the biased predictions that arise in individual base-learners from differences in specification and prediction accuracy. Crash, traffic, and roadway inventory data were collected and integrated for 2013 to 2017. The data are split into training (2013-2015), validation (2016), and testing (2017) datasets. After the five base-learners are trained on the training data, their predictions on the validation data are used to train a meta-learner.
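
The workflow in this section (temporal split, base-learners fitted on the training years, a meta-learner fitted on validation-year predictions) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the data are synthetic, three deliberately simple base-learners stand in for the five used in the study, and the meta-learner is assumed to be a grid search over non-negative weights that sum to one, since the abstract does not say how the optimal weights were computed. All function names, variable names, and coefficients are hypothetical.

```python
import random

random.seed(42)  # reproducible synthetic data

def make_row(year):
    """One hypothetical segment-year record; the coefficients are invented."""
    density = random.uniform(0, 40)   # commercial driveways per mile
    offset = random.uniform(2, 30)    # average offset to fixed objects (ft)
    expected = 2.0 + 0.1 * density - 0.05 * offset
    crashes = max(0, round(expected + random.gauss(0, 1)))
    return {"year": year, "density": density, "offset": offset, "y": crashes}

rows = [make_row(year) for year in range(2013, 2018) for _ in range(200)]

# Temporal split as in the abstract: train 2013-2015, validate 2016, test 2017.
train = [r for r in rows if r["year"] <= 2015]
valid = [r for r in rows if r["year"] == 2016]
test = [r for r in rows if r["year"] == 2017]

def fit_mean(data):
    """Base-learner 1: always predicts the mean training crash count."""
    m = sum(r["y"] for r in data) / len(data)
    return lambda r: m

def fit_linear(data, key):
    """Base-learner 2: one-variable ordinary least squares."""
    n = len(data)
    mx = sum(r[key] for r in data) / n
    my = sum(r["y"] for r in data) / n
    sxx = sum((r[key] - mx) ** 2 for r in data)
    slope = sum((r[key] - mx) * (r["y"] - my) for r in data) / sxx
    return lambda r: my + slope * (r[key] - mx)

def fit_stump(data, key):
    """Base-learner 3: depth-1 tree that splits at the median of `key`."""
    cut = sorted(r[key] for r in data)[len(data) // 2]
    low = [r["y"] for r in data if r[key] < cut]
    high = [r["y"] for r in data if r[key] >= cut]
    low_mean, high_mean = sum(low) / len(low), sum(high) / len(high)
    return lambda r: low_mean if r[key] < cut else high_mean

base_learners = [fit_mean(train), fit_linear(train, "density"),
                 fit_stump(train, "offset")]

def base_predictions(data):
    """One tuple of base-learner predictions per record."""
    return [tuple(f(r) for f in base_learners) for r in data]

def blend(preds, w):
    """Weighted combination of the base-learner predictions."""
    return [sum(wk * pk for wk, pk in zip(w, p)) for p in preds]

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def fit_weights(preds, ys, steps=20):
    """Assumed meta-learner: grid search over weights >= 0 that sum to 1."""
    best_err, best_w = None, None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            err = mse(blend(preds, w), ys)
            if best_err is None or err < best_err:
                best_err, best_w = err, w
    return best_w

# The meta-learner sees only validation-year predictions, never training rows.
val_preds, val_y = base_predictions(valid), [r["y"] for r in valid]
weights = fit_weights(val_preds, val_y)
val_mse_base = [mse([p[k] for p in val_preds], val_y) for k in range(3)]
val_mse_stack = mse(blend(val_preds, weights), val_y)

# Out-of-sample comparison on the held-out test year.
test_preds, test_y = base_predictions(test), [r["y"] for r in test]
for k, name in enumerate(["mean", "linear", "stump"]):
    print(name, round(mse([p[k] for p in test_preds], test_y), 3))
print("stacked", round(mse(blend(test_preds, weights), test_y), 3), weights)
```

Because the weight grid includes the settings that select each base-learner alone, the blended validation error in this sketch cannot exceed that of the best single base-learner. The test-year error carries no such guarantee, which is why the study reports a separate out-of-sample comparison.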

RESULTS

Results of the statistical models reveal that crashes increase with the density (number per mile) of commercial driveways and decrease as the average offset distance to fixed objects increases. The individual ML methods show similar results in terms of variable importance. A comparison of out-of-sample predictions across the models and methods confirms the superiority of "Stacking" over the alternatives considered.

CONCLUSIONS AND PRACTICAL APPLICATIONS

From a practical standpoint, "stacking" can enhance prediction accuracy compared with relying on a single base-learner with a particular specification. When applied systemically, stacking can help identify more appropriate countermeasures.
