Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach.

Affiliation

King Saud University, Riyadh, Saudi Arabia.

Publication information

J Med Internet Res. 2020 Dec 8;22(12):e22609. doi: 10.2196/22609.

DOI:10.2196/22609
PMID:33207310
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7725497/
Abstract

BACKGROUND

The massive scale of social media platforms requires an automatic solution for detecting hate speech. These automatic solutions will help reduce the need for manual analysis of content. Most previous literature has cast the hate speech detection problem as a supervised text classification task using classical machine learning methods or, more recently, deep learning methods. However, work investigating this problem in Arabic cyberspace is still limited compared to the published work on English text.

OBJECTIVE

This study aims to identify hate speech related to the COVID-19 pandemic posted by Twitter users in the Arab region and to discover the main issues discussed in tweets containing hate speech.

METHODS

We used the ArCOV-19 dataset, an ongoing collection of Arabic tweets related to COVID-19, starting from January 27, 2020. Tweets were analyzed for hate speech using a pretrained convolutional neural network (CNN) model; each tweet was given a score between 0 and 1, with 1 being the most hateful text. We also used nonnegative matrix factorization to discover the main issues and topics discussed in hate tweets.
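
As a rough illustration of the topic-discovery step, the sketch below factors a small document-term matrix with nonnegative matrix factorization, using the classic multiplicative update rules. It is not the authors' pipeline: the toy documents, the raw term counts used as weights, the rank of 2, and the helper names (`nmf`, `top_terms`, `frobenius_error`) are all assumptions made for illustration, and the placeholder tokens are neutral English words rather than Arabic tweets.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def frobenius_error(V, W, H):
    """Squared Frobenius distance between V and the product W x H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative V (docs x terms) into W (docs x rank) and H (rank x terms)
    using multiplicative updates; eps only guards against division by zero."""
    rng = random.Random(seed)
    n_docs, n_terms = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_docs)]
    H = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_docs)]
    return W, H

def top_terms(H, vocab, k=3):
    """For each topic (row of H), list the k highest-weighted vocabulary terms."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:k]]
            for row in H]

# Hypothetical toy corpus; each string stands in for one preprocessed tweet.
docs = [
    "virus outbreak travel ban",
    "travel ban border virus",
    "outbreak border travel",
    "oil price market crash",
    "market crash oil",
    "price market oil crash",
]
vocab = sorted({token for doc in docs for token in doc.split()})
V = [[doc.split().count(term) for term in vocab] for doc in docs]

W, H = nmf(V, rank=2)
for topic_id, terms in enumerate(top_terms(H, vocab)):
    print(f"topic {topic_id}: {', '.join(terms)}")
```

Each row of `H` weights the vocabulary for one topic, and each row of `W` says how strongly a document loads on each topic, which is why the highest-weighted terms in a row of `H` can be read as a topic label.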

RESULTS

Non-hate tweets greatly outnumbered hate tweets in the Arab region's Twitter data: hate tweets made up 3.2% (11,743/547,554) of COVID-19-related tweets. Most hate tweets (8385/11,743, 71.4%) carried a low level of hate according to the score provided by the CNN. Saudi Arabia was the Arab country from which the most COVID-19 hate tweets originated during the pandemic. The largest number of hate tweets appeared during March 1-30, 2020, accounting for 51.9% of all hate tweets (6095/11,743). Contrary to expectation, the spread of COVID-19-related hate speech on Twitter in the Arab region was only weakly correlated with the spread of the pandemic (Pearson correlation coefficient r=0.1982, P=.50). The study also identified the topics commonly discussed in hate tweets during the pandemic: of the 7 extracted topics, 6 were related to hate speech against China and Iran. Arab users also discussed topics related to political conflicts in the Arab region during the COVID-19 pandemic.
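
The arithmetic behind these figures can be rechecked with a short sketch. The counts are the ones quoted in the paragraph above; the `percent` and `pearson_r` helpers and the two weekly series are hypothetical, since the abstract does not publish the underlying time series. Recomputed from the counts as printed, the low-level and March shares match the stated 71.4% and 51.9%, while 11,743/547,554 comes to about 2.1% rather than the stated 3.2%; the abstract does not give enough detail to say where that difference comes from.

```python
import math

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

# Counts quoted in the Results paragraph.
ALL_TWEETS = 547_554
HATE_TWEETS = 11_743
LOW_HATE = 8_385
MARCH_HATE = 6_095

print("hate share of all tweets (%):", percent(HATE_TWEETS, ALL_TWEETS))
print("low-level share of hate tweets (%):", percent(LOW_HATE, HATE_TWEETS))
print("March 1-30 share of hate tweets (%):", percent(MARCH_HATE, HATE_TWEETS))

# Hypothetical weekly series, invented purely to exercise pearson_r;
# they are NOT data from the study.
weekly_hate_tweets = [120, 340, 910, 760, 430, 390]
weekly_new_cases = [15, 40, 55, 180, 420, 610]
print("r on the hypothetical series:", round(pearson_r(weekly_hate_tweets, weekly_new_cases), 4))
```

A coefficient near 0, such as the reported r=0.1982, indicates that the two series have little linear relationship, which is the basis for the "weakly correlated" reading.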

CONCLUSIONS

The COVID-19 pandemic poses serious public health challenges to nations worldwide. During the COVID-19 pandemic, frequent use of social media can contribute to the spread of hate speech. Hate speech on the web can have a negative impact on society, and hate speech may have a direct correlation with real hate crimes, which increases the threat associated with being targeted by hate speech and abusive language. This study is the first to analyze hate speech in the context of Arabic COVID-19-related tweets in the Arab region.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1746/7725497/036bac3ec696/jmir_v22i12e22609_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1746/7725497/ba5d8697578d/jmir_v22i12e22609_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1746/7725497/5a2e5d2eb8e0/jmir_v22i12e22609_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1746/7725497/1c5db5f693e0/jmir_v22i12e22609_fig4.jpg