An NLP-assisted Bayesian time-series analysis for prevalence of Twitter cyberbullying during the COVID-19 pandemic.

Author information

Perez Christopher, Karmakar Sayar

Affiliation

Department of Statistics, University of Florida, Gainesville, FL 32601 USA.

Publication information

Soc Netw Anal Min. 2023;13(1):51. doi: 10.1007/s13278-023-01053-4. Epub 2023 Mar 15.

DOI:10.1007/s13278-023-01053-4

PMID:36937491

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10016178/

Abstract

COVID-19 has brought about many changes in social dynamics. Stay-at-home orders and disruptions in school teaching can influence bullying behavior both in person and online, and both forms lead to negative outcomes for victims. To study cyberbullying specifically, 1 million tweets containing keywords associated with abuse were collected from the beginning of 2019 to the end of 2021 with the Twitter API search endpoint. A natural language processing model pre-trained on a Twitter corpus generated probabilities for the tweets being offensive and hateful. To overcome the limitations of sampling, data were also collected using the count endpoint. The fraction of tweets in a given daily sample marked as abusive is multiplied by the number reported by the count endpoint. Once these adjusted counts are assembled, a Bayesian autoregressive Poisson model allows one to study the mean trend and lag functions of the data and how they vary over time. The results reveal strong weekly and yearly seasonality in hateful speech, with slight differences across years that may be attributed to COVID-19.
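
The two computational steps named in the abstract, the adjusted daily count and the autoregressive Poisson likelihood, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 0.5 cutoff for marking a tweet as abusive, the function names `adjusted_count` and `poisson_ar_loglik`, and the additive form of the conditional mean (a mean trend plus lag functions applied to past counts, each evaluated at rescaled time t/T). The sketch only evaluates a log-likelihood; the Bayesian step of placing priors on the trend and lag functions and sampling from the posterior is not shown.

```python
import math


def adjusted_count(sample_probs, total_count, threshold=0.5):
    """Scale the abusive fraction of one day's sample up to that day's total.

    sample_probs: model probabilities for the sampled tweets of one day.
    total_count:  number of matching tweets reported for that day.
    threshold:    hypothetical cutoff for "marked as abusive" (an assumption).
    """
    if not sample_probs:
        return 0.0
    flagged = sum(1 for p in sample_probs if p >= threshold)
    return flagged / len(sample_probs) * total_count


def poisson_ar_loglik(counts, mean_trend, lag_coefs):
    """Conditional log-likelihood of a Poisson autoregression.

    Assumed form: counts[t] ~ Poisson(lam_t) with
        lam_t = mean_trend(u) + sum_i lag_coefs[i](u) * counts[t - 1 - i],
    where u = t / T is rescaled time. mean_trend and each lag_coefs[i]
    are callables of u, so both may vary over time.
    """
    T = len(counts)
    p = len(lag_coefs)
    total = 0.0
    for t in range(p, T):  # the first p observations only serve as lags
        u = t / T
        lam = mean_trend(u) + sum(
            a(u) * counts[t - 1 - i] for i, a in enumerate(lag_coefs)
        )
        x = counts[t]
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total


# Toy example: four sampled tweets, two above the cutoff, 1000 tweets that day.
daily = adjusted_count([0.9, 0.2, 0.7, 0.1], total_count=1000)

# Toy series with a constant trend and a single constant lag coefficient.
ll = poisson_ar_loglik([3, 4], lambda u: 2.0, [lambda u: 0.5])
```

Adjusted counts are generally not integers, so in this sketch they would need to be rounded before being passed to `poisson_ar_loglik` as Poisson observations.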

