Wu ChienHsing, Kao Shu-Chen, Chang Chia-Chen
National University of Kaohsiung, Kaohsiung, Taiwan.
Kun Shan University, Tainan, Taiwan.
Heliyon. 2022 Sep;8(9):e10302. doi: 10.1016/j.heliyon.2022.e10302. Epub 2022 Aug 24.
Extracting knowledge from open traffic accident data has been attracting increasing attention from policymakers responsible for road safety. This article presents a knowledge elicitation approach for exploring the determinants of traffic accidents using open government data from an urban area in Taiwan. The collected open dataset contains 47,974 cases, each with 34 decisional attributes and one predictive attribute (type of injury, e.g., head, chest, or leg). Prediction models built with a classification-oriented mechanism, together with the rules generated from them, were compared across two datasets: one from before (30,116 cases) and one from after (17,868 cases) the start of efforts to combat the Covid-19 pandemic. Prediction accuracy was acceptable but not high, at 70.73% for one dataset and 74.77% for the other. Determinants in the human and vehicle categories ranked higher in classification than those in the temporal and environment categories. Traffic accidents involving motorcycles were 5.13% higher in one dataset than in the other, whereas those involving cars were 4.11% lower. Leg or foot injuries were 3.46% higher, whereas other types of injury were up to 1.00% lower. The average support for rules in the rule base and the simplicity of the decision tree were higher for one dataset than for the other. The research demonstrates the value of open government data in prediction model development and knowledge elicitation to support policymaking in the traffic safety domain.