Geo-parsing and analysis of road traffic crash incidents for data-driven emergency response planning.

Author information

Idakwo Patricia Ojonoka, Adekanmbi Olubayo, Soronnadi Anthony, David Amos

Affiliations

Department of Computer Science, African University of Science and Technology, Abuja, Nigeria.

Data Science Nigeria, AI Hub, 33 Queens Street, Alagomeji, Yaba, Lagos, Nigeria.

Publication information

Heliyon. 2024 Dec 7;11(4):e41067. doi: 10.1016/j.heliyon.2024.e41067. eCollection 2025 Feb 28.

Abstract

Road traffic crashes (RTCs) are a major public health concern worldwide, particularly in Nigeria, where road transport is the most common mode of transportation. This study presents a geo-parsing approach for geographic information extraction (IE) of RTC incidents from news articles. We developed two custom, spaCy-based, RTC domain-specific named entity recognition (NER) models: RTC NER Baseline and RTC NER. Both models were trained on a dataset of Nigerian RTC news articles. Evaluation shows that the RTC NER model outperforms the RTC NER Baseline model on both Nigerian and international test data across all three standard metrics: precision, recall and F1-score. The RTC NER model achieves precision, recall and F1-score values of 93.63, 93.61 and 93.62, respectively, on the Nigerian test data, and 91.9, 87.88 and 89.84, respectively, on the international test data, indicating that it generalizes to IE from RTC reports beyond the country it was trained on. We further applied the RTC NER model for feature extraction, using geo-parsing techniques to extract RTC location details and retrieve the corresponding geographical coordinates, which produced a structured Nigeria RTC dataset for exploratory data analysis. Our study showcases the use of the RTC NER model for IE from RTC-related reports, supporting analysis aimed at identifying RTC risk areas for data-driven emergency response planning.
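The F1-scores quoted in the abstract can be related to the reported precision and recall through the standard definition of F1 as their harmonic mean. The following is a minimal sketch of that arithmetic only; the function name `f1_score` and the dictionary layout are illustrative assumptions and are not taken from the paper, which does not describe its evaluation code here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the RTC NER model, as given in the abstract.
reported = {
    "Nigerian test data": (93.63, 93.61, 93.62),
    "International test data": (91.9, 87.88, 89.84),
}

for name, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{name}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported}")
```

Because the published precision and recall are themselves rounded, the recomputed value may differ from the reported one in the second decimal place.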

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a7c/11876919/a94146ce1b5c/gr1.jpg
