
Dataset on fatal road traffic crash attributes extracted via natural language processing of online media articles in India.

Authors

Ashutosh Ashutosh, Chand Sai

Affiliation

Transportation Research and Injury Prevention Centre (TRIP Centre), Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India.

Publication

Data Brief. 2025 Apr 23;60:111578. doi: 10.1016/j.dib.2025.111578. eCollection 2025 Jun.

DOI:10.1016/j.dib.2025.111578
PMID:40416746
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12098169/

Abstract

Road traffic crashes are among the leading causes of death globally and carry substantial social and economic costs. Online media is a key source of public information on road safety, so understanding how crashes are reported is crucial for detecting potential reporting biases and enhancing safety awareness. To address the lack of high-quality, media-reported fatal crash data, fatal crash reports for 2022-2023 were extracted from The Times of India, a prominent Indian news outlet. The resulting dataset comprises 2898 fatal crashes, 6584 fatalities and 7812 injuries, and records 16 detailed crash attributes. It was developed using web scraping and natural language processing (NLP) techniques: automated tools such as Selenium and BeautifulSoup extracted raw data from the news source, and NLP algorithms then identified key crash attributes, including crash date, location, vehicles involved and number of fatalities. This study provides a replicable framework for constructing robust datasets from media sources, enabling multidisciplinary research on transportation safety, media reporting and public perception of crashes. The dataset is expected to serve as a valuable resource for analysing how the media shapes road safety narratives and for identifying high-fatality crash locations.
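
The abstract does not say which NLP algorithms were used, so the following is only a minimal rule-based sketch of the attribute-extraction step, not the authors' pipeline. It assumes the article text has already been scraped and reduced to a plain string, and it pulls fatality and injury counts out of that text with regular expressions. All names (`NUMBER_WORDS`, `FILLER`, `CrashCounts`, `extract_counts`) and the keyword lists are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a small vocabulary of spelled-out counts seen in news copy.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# A count is either digits or one of the spelled-out words above.
NUM = r"(\d+|" + "|".join(NUMBER_WORDS) + r")"
# Assumption: only these words may sit between the count and the keyword.
FILLER = r"(?:\s+(?:people|persons|others|passengers|were|was|got))*"


def _pattern(keywords: str) -> "re.Pattern[str]":
    """Build a regex matching '<count> [filler words] <keyword>'."""
    return re.compile(
        r"\b" + NUM + FILLER + r"\s+(?:" + keywords + r")\b",
        re.IGNORECASE,
    )


FATAL_RE = _pattern("killed|died|dead")
INJURY_RE = _pattern("injured|hurt|wounded")


@dataclass
class CrashCounts:
    fatalities: Optional[int]
    injuries: Optional[int]


def _to_int(token: str) -> int:
    """Convert '5' or 'Three' to an integer."""
    token = token.lower()
    return NUMBER_WORDS[token] if token in NUMBER_WORDS else int(token)


def extract_counts(text: str) -> CrashCounts:
    """Return the first fatality and injury counts found, or None if absent."""
    fatal = FATAL_RE.search(text)
    injured = INJURY_RE.search(text)
    return CrashCounts(
        fatalities=_to_int(fatal.group(1)) if fatal else None,
        injuries=_to_int(injured.group(1)) if injured else None,
    )


if __name__ == "__main__":
    sample = ("Three people were killed and 5 others injured when a truck "
              "hit a car near Pune on Monday.")
    print(extract_counts(sample))  # CrashCounts(fatalities=3, injuries=5)
```

Restricting the filler words keeps an unrelated number, such as a year earlier in the sentence, from being read as a count. Attributes such as location or vehicle type would likely need more than pattern matching, for example a named-entity recognizer.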

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/673da64eb25c/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/938d6f907958/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/60e029abfea1/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/6d4600269e51/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/db548a68f4e6/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef32/12098169/6431bbc5ba26/gr6.jpg
