Department of Civil, Environmental and Construction Engineering, University of Central Florida, 4000 Central Florida Blvd., Orlando, FL 32816-2450, USA.
J Safety Res. 2009;40(4):317-27. doi: 10.1016/j.jsr.2009.05.003. Epub 2009 Aug 5.
This study aims to identify the traffic, highway design, and driver-vehicle information significantly associated with fatal/severe crashes on urban arterials for different crash types. Since the data used in this study are observational (i.e., collected outside the purview of a designed experiment), an information discovery approach is adopted.
Random Forests, which are ensembles of individual trees grown by the CART (Classification and Regression Tree) algorithm, are widely applied for this purpose. Specifically, conditional inference forests have been implemented in this study. In each tree of a conditional inference forest, splits are based on the strength of association between the candidate variables and the response, measured with chi-square test statistics. Apart from identifying the variables that improve classification accuracy, the methodology also clearly identifies the variables that are neutral to accuracy and those that decrease it.
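The association-based split selection described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names and the toy crash records are hypothetical, and a full conditional inference forest compares p-values from permutation tests rather than raw statistics. Because every variable in this toy example is binary (one degree of freedom each), ranking by the Pearson chi-square statistic gives the same order as ranking by p-value.

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic for the contingency table of two
    equally long sequences of categorical values."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def rank_by_association(features, response):
    """Return (name, statistic) pairs, strongest association first.
    Only comparable across variables with equal degrees of freedom."""
    scores = {
        name: chi_square_statistic(values, response)
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records (not data from the study): 1 = yes, 0 = no.
severe = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "alcohol_involved": [1, 1, 1, 0, 0, 0, 0, 1],
    "daylight": [1, 0, 1, 0, 1, 0, 1, 0],
}

ranking = rank_by_association(features, severe)
best_variable = ranking[0][0]  # candidate chosen for the next split
```

In this sketch the variable with the largest statistic would be chosen for the split; a real implementation would also stop splitting when no association is statistically significant.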
The methodology proved insightful in identifying the variables of interest in the database (e.g., alcohol/drug use and higher posted speed limits contribute to severe crashes). Failure to use safety equipment by all passengers and the presence of a driver or passenger in a vulnerable age group (older than 55 years or younger than 3 years) increased the severity of injuries given that a crash had occurred. A new variable, 'element', is used in this study; it assigns crashes to segments, intersections, or access points based on information about site location, traffic control, and the presence of signals.
The authors were able to identify roadway locations where severe crashes tend to occur. For example, segments and access points were found to be riskier for single-vehicle crashes. Higher skid resistance and k-factor were also associated with increased injury severity in crashes.