Taamneh Madhar, Taamneh Salah, Alkheder Sharaf
a Department of Civil Engineering, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid, Jordan.
b Department of Computer Science, College of Natural Sciences and Mathematics, University of Houston, TX, USA.
Int J Inj Contr Saf Promot. 2017 Sep;24(3):388-395. doi: 10.1080/17457300.2016.1224902. Epub 2016 Sep 8.
Artificial neural networks (ANNs) have been widely used to predict the severity of road traffic crashes. All available information about past accidents is typically used to build a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, which in most cases leads to less accurate predictors. Hierarchical clustering is a well-known clustering method that groups data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents that occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. To reduce the amount of variation in the data, hierarchical clustering was applied to the data set to organize it into six different forms, each with a different number of clusters (i.e., from one to six clusters). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas the second model was built using only 66% of the accidents and tested on the remaining 34% (percentage split). Finally, the weighted average accuracy was computed for each type of model in each form of the data. The results show that when the models are tested on the training set, clustering prior to classification achieves 11%-16% higher accuracy than classification without clustering, whereas with the percentage split the gain is 2%-5%. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy when both types of models are taken into account.
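The two evaluation steps described in the abstract, the 66%/34% percentage split and the weighted average accuracy over clusters, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the fixed shuffling seed, and the example numbers are hypothetical. The sketch also assumes that each cluster's accuracy is weighted by the number of accidents in that cluster, which the abstract does not state explicitly.

```python
import random
from typing import List, Sequence, Tuple


def percentage_split(records: Sequence, train_fraction: float = 0.66,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle the records and split them into a training and a test part.

    Assumption: the split is random; the seed is fixed only to make the
    sketch reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def weighted_average_accuracy(clusters: Sequence[Tuple[int, float]]) -> float:
    """Combine per-cluster accuracies into one score.

    Each item is (number of accidents in the cluster, accuracy of the
    model built for that cluster). Assumption: weights are cluster sizes.
    """
    total = sum(size for size, _ in clusters)
    if total == 0:
        raise ValueError("at least one cluster must contain accidents")
    return sum(size * accuracy for size, accuracy in clusters) / total


# Hypothetical numbers for illustration only; they are not results from the study.
train, test = percentage_split(range(100))
overall = weighted_average_accuracy([(3000, 0.80), (2000, 0.70), (1000, 0.60)])
```

Weighting by cluster size keeps a small cluster with an unusually high or low accuracy from dominating the overall score, so forms of the data with different numbers of clusters can be compared on the same footing.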