Chang Li-Yen, Chen Wen-Chieh
Graduate Institute of Transportation and Logistics, National Chia-Yi University, Taiwan.
J Safety Res. 2005;36(4):365-75. doi: 10.1016/j.jsr.2005.06.013.
Statistical models, such as Poisson or negative binomial regression models, have been used to analyze vehicle accident frequency for many years. However, these models rest on their own assumptions and on a pre-defined underlying relationship between the dependent and independent variables. If these assumptions are violated, the model can produce erroneous estimates of accident likelihood. Classification and Regression Tree (CART), one of the most widely applied data mining techniques, is commonly used in business administration, industry, and engineering. CART does not require any pre-defined underlying relationship between the target (dependent) variable and the predictors (independent variables) and has been shown to be a powerful tool, particularly for prediction and classification problems.
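To illustrate why CART needs no pre-defined functional form, the sketch below shows the split search that a regression tree repeats at each node: it tries candidate thresholds on every predictor and keeps the one that most reduces the squared error of the target. This is a minimal illustration, not the study's implementation; the function names (`sse`, `best_split`), the two predictors, and all numbers are hypothetical and do not come from the freeway data.

```python
# Minimal sketch of the split search at the heart of a CART regression tree.
# All data below are synthetic and purely illustrative; they are not from the study.


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return (feature_index, threshold, total_sse) of the best binary split.

    Every midpoint between consecutive distinct values of every feature is
    tried, and the split with the smallest summed SSE of the two child
    nodes is kept.
    """
    best = None
    n_features = len(rows[0])
    for j in range(n_features):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[j] <= threshold]
            right = [y for r, y in zip(rows, targets) if r[j] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (j, threshold, total)
    return best


# Hypothetical road sections: (daily traffic in thousands of vehicles, lane count)
sections = [(20, 2), (25, 3), (30, 2), (60, 3), (65, 2), (70, 3)]
# Hypothetical accident counts that jump once traffic passes some level
accidents = [1, 2, 1, 8, 9, 8]

feature, threshold, error = best_split(sections, accidents)
print(feature, threshold, round(error, 3))  # → 0 45.0 1.333
```

In this toy data the search selects the traffic predictor (index 0) with a threshold of 45, without being told that accidents depend on traffic or in what form. A full CART model would apply the same search recursively to each child node and then prune the resulting tree.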
This study collected 2001-2002 accident data for National Freeway 1 in Taiwan. A CART model and a negative binomial regression model were developed to establish the empirical relationship between traffic accidents and highway geometric variables, traffic characteristics, and environmental factors.
The CART findings indicated that the average daily traffic volume and precipitation variables were the key determinants for freeway accident frequencies. By comparing the prediction performance between the CART and the negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies.