
Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning.

Affiliations

Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia.

School of Architecture and Built Environment, Queensland University of Technology, 2 George Street, Brisbane 4000, QLD, Australia.

Publication information

Sensors (Basel). 2021 Apr 24;21(9):2993. doi: 10.3390/s21092993.

DOI: 10.3390/s21092993
PMID: 33923247
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8123223/
Abstract

Digital societies could be characterized by their increasing desire to express themselves and interact with others. This is being realized through digital platforms such as social media that have increasingly become convenient and inexpensive sensors compared to physical sensors in many sectors of smart societies. One such major sector is road transportation, which is the backbone of modern economies and costs globally 1.25 million deaths and 50 million human injuries annually. The cutting-edge on big data-enabled social media analytics for transportation-related studies is limited. This paper brings a range of technologies together to detect road traffic-related events using big data and distributed machine learning. The most specific contribution of this research is an automatic labelling method for machine learning-based traffic-related event detection from Twitter data in the Arabic language. The proposed method has been implemented in a software tool called Iktishaf+ (an Arabic word meaning discovery) that is able to detect traffic events automatically from tweets in the Arabic language using distributed machine learning over Apache Spark. The tool is built using nine components and a range of technologies including Apache Spark, Parquet, and MongoDB. Iktishaf+ uses a light stemmer for the Arabic language developed by us. We also use in this work a location extractor developed by us that allows us to extract and visualize spatio-temporal information about the detected events. The specific data used in this work comprises 33.5 million tweets collected from Saudi Arabia using the Twitter API. Using support vector machines, naïve Bayes, and logistic regression-based classifiers, we are able to detect and validate several real events in Saudi Arabia without prior knowledge, including a fire in Jeddah, rains in Makkah, and an accident in Riyadh. The findings show the effectiveness of Twitter media in detecting important events with no prior knowledge about them.
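
The abstract names automatic labelling followed by a trained classifier but does not spell out either step, so the following is only a minimal sketch under stated assumptions. It assumes labels come from a keyword lexicon (`EVENT_KEYWORDS` is hypothetical, with English placeholder words rather than Arabic), and it trains a small multinomial naive Bayes model in plain Python on the automatically labelled tweets. It is not the authors' implementation: Iktishaf+ runs on Apache Spark, applies an Arabic light stemmer, and also uses support vector machines and logistic regression.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lexicon for illustration only; the paper's dictionaries are
# not given in the abstract, and the real tool processes Arabic text.
EVENT_KEYWORDS = {
    "accident": {"accident", "crash", "collision"},
    "fire": {"fire", "smoke", "burning"},
    "rain": {"rain", "flood", "storm"},
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    cleaned = (tok.strip(".,!?#@").lower() for tok in text.split())
    return [tok for tok in cleaned if tok]


def auto_label(text):
    """Return the event type with the most keyword hits, or None if no hit.

    Ties go to the event listed first in EVENT_KEYWORDS.
    """
    tokens = tokenize(text)
    hits = {event: sum(tok in words for tok in tokens)
            for event, words in EVENT_KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented example tweets; unlabelled ones are dropped before training.
tweets = [
    "Huge crash on the ring road, two cars in a collision",
    "Smoke everywhere, a warehouse is burning near the highway",
    "Heavy rain and flood water closed the tunnel",
    "Another accident near the bridge, traffic is stuck",
    "Lovely dinner with friends tonight",
]
labelled = [(t, auto_label(t)) for t in tweets]
train = [(t, y) for t, y in labelled if y is not None]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(model.predict("cars stuck after a collision"))
```

The point of the two-stage design is that the classifier can generalise beyond the lexicon: once trained on automatically labelled text, it scores words such as "cars" or "stuck" that never appeared in the keyword lists. In a distributed setting the same two stages would presumably map onto Spark transformations and MLlib estimators, which this sketch does not attempt.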


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/e179e4d304fe/sensors-21-02993-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/4d2e6e6cc9ed/sensors-21-02993-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/2d5e019900b2/sensors-21-02993-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/028d638482dc/sensors-21-02993-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/3e3d305eae4e/sensors-21-02993-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/28f500cc7018/sensors-21-02993-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/baae3bd56dc8/sensors-21-02993-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/418ee73207a8/sensors-21-02993-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/509348101040/sensors-21-02993-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/b9aacfdc75d3/sensors-21-02993-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/3038a0d8e734/sensors-21-02993-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/5ccb55f02f79/sensors-21-02993-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/f04e817b1eeb/sensors-21-02993-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/46dfee941217/sensors-21-02993-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/42cb73de9ae4/sensors-21-02993-g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1cb/8123223/106d17103838/sensors-21-02993-g016.jpg

Similar articles

1. Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning.
   Sensors (Basel). 2021 Apr 24;21(9):2993. doi: 10.3390/s21092993.
2. COVID-19: Detecting Government Pandemic Measures and Public Concerns from Twitter Arabic Data Using Distributed Machine Learning.
   Int J Environ Res Public Health. 2021 Jan 1;18(1):282. doi: 10.3390/ijerph18010282.
3. A machine learning-based approach for sentiment analysis on distance learning from Arabic Tweets.
   PeerJ Comput Sci. 2022 Jul 26;8:e1047. doi: 10.7717/peerj-cs.1047. eCollection 2022.
4. Smarter Traffic Prediction Using Big Data, In-Memory Computing, Deep Learning and GPUs.
   Sensors (Basel). 2019 May 13;19(9):2206. doi: 10.3390/s19092206.
5. Pretrained Transformer Language Models Versus Pretrained Word Embeddings for the Detection of Accurate Health Information on Arabic Social Media: Comparative Study.
   JMIR Form Res. 2022 Jun 29;6(6):e34834. doi: 10.2196/34834.
6. Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach.
   J Med Internet Res. 2020 Dec 8;22(12):e22609. doi: 10.2196/22609.
7. A comprehensive social media data processing and analytics architecture by using big data platforms: a case study of twitter flood-risk messages.
   Earth Sci Inform. 2021;14(2):913-929. doi: 10.1007/s12145-021-00601-w. Epub 2021 Mar 11.
8. How Do You #relax When You're #stressed? A Content Analysis and Infodemiology Study of Stress-Related Tweets.
   JMIR Public Health Surveill. 2017 Jun 13;3(2):e35. doi: 10.2196/publichealth.5939.
9. Machine Learning to Detect Self-Reporting of Symptoms, Testing Access, and Recovery Associated With COVID-19 on Twitter: Retrospective Big Data Infoveillance Study.
   JMIR Public Health Surveill. 2020 Jun 8;6(2):e19509. doi: 10.2196/19509.
10. Solution to Detect, Classify, and Report Illicit Online Marketing and Sales of Controlled Substances via Twitter: Using Machine Learning and Web Forensics to Combat Digital Opioid Access.
    J Med Internet Res. 2018 Apr 27;20(4):e10029. doi: 10.2196/10029.

Cited by

1. Multi-generational labour markets: Data-driven discovery of multi-perspective system parameters using machine learning.
   Sci Prog. 2023 Oct-Dec;106(4):368504231213788. doi: 10.1177/00368504231213788.
2. Psychological Health and Drugs: Data-Driven Discovery of Causes, Treatments, Effects, and Abuses.
   Toxics. 2023 Mar 20;11(3):287. doi: 10.3390/toxics11030287.
3. Developing Smartness in Emerging Environments and Applications with a Focus on the Internet of Things.
   Sensors (Basel). 2022 Nov 18;22(22):8939. doi: 10.3390/s22228939.
4. Analysis of the implementation of urban computing in smart cities: A framework for the transformation of Saudi cities.
   Heliyon. 2022 Oct 18;8(10):e11138. doi: 10.1016/j.heliyon.2022.e11138. eCollection 2022 Oct.
5. LidSonic V2.0: A LiDAR and Deep-Learning-Based Green Assistive Edge Device to Enhance Mobility for the Visually Impaired.
   Sensors (Basel). 2022 Sep 30;22(19):7435. doi: 10.3390/s22197435.
6. Imtidad: A Reference Architecture and a Case Study on Developing Distributed AI Services for Skin Disease Diagnosis over Cloud, Fog and Edge.
   Sensors (Basel). 2022 Feb 26;22(5):1854. doi: 10.3390/s22051854.

References

1. COVID-19: Detecting Government Pandemic Measures and Public Concerns from Twitter Arabic Data Using Distributed Machine Learning.
   Int J Environ Res Public Health. 2021 Jan 1;18(1):282. doi: 10.3390/ijerph18010282.
2. How can social media analytics assist authorities in pandemic-related policy decisions? Insights from Australian states and territories.
   Health Inf Sci Syst. 2020 Oct 15;8(1):37. doi: 10.1007/s13755-020-00121-9. eCollection 2020 Dec.
3. Distributed Artificial Intelligence-as-a-Service (DAIaaS) for Smarter IoE and 6G Environments.
   Sensors (Basel). 2020 Oct 13;20(20):5796. doi: 10.3390/s20205796.
4. Smarter Traffic Prediction Using Big Data, In-Memory Computing, Deep Learning and GPUs.
   Sensors (Basel). 2019 May 13;19(9):2206. doi: 10.3390/s19092206.
5. Traffic Congestion Detection System through Connected Vehicles and Big Data.
   Sensors (Basel). 2016 Apr 28;16(5):599. doi: 10.3390/s16050599.
6. Use of the 'ex vivo' test to study long-term bacterial survival on human skin and their sensitivity to antisepsis.
   J Appl Microbiol. 2004;97(6):1149-60. doi: 10.1111/j.1365-2672.2004.02403.x.