Factor-Inwentash Faculty of Social Work, University of Toronto, Toronto, ON, Canada.
Faculty of Information, University of Toronto, Toronto, ON, Canada.
J Med Internet Res. 2023 May 15;25:e46084. doi: 10.2196/46084.
Scholars have used data from in-person interviews, administrative systems, and surveys for sexual violence research. Using Twitter as a data source for examining the nature of sexual violence is a relatively new and underexplored area of study.
We aimed to perform a scoping review of the current literature on using Twitter data for researching sexual violence, elaborate on the validity of the methods, and discuss the implications and limitations of existing studies.
In April 2022, we performed a literature search in the following 6 databases: APA PsycInfo (Ovid), Scopus, PubMed, International Bibliography of Social Sciences (ProQuest), Criminal Justice Abstracts (EBSCO), and Communications Abstracts (EBSCO). The initial search identified 3759 articles, which were imported into Covidence. Seven independent reviewers screened these articles in 2 steps: (1) title and abstract screening, and (2) full-text screening. The inclusion criteria were as follows: (1) empirical research, (2) focus on sexual violence, (3) analysis of Twitter data (ie, tweets or Twitter metadata), and (4) text in English. We selected and coded the 121 articles that met the inclusion criteria.
We coded and presented the 121 articles that used Twitter-based data for sexual violence research. Nearly three-quarters (89/121, 73.6%) of the articles were published in peer-reviewed journals after 2018. The reviewed articles collectively analyzed about 79.6 million tweets. The primary approaches to using Twitter as a data source were content text analysis (112/121, 92.6%) and sentiment analysis (31/121, 25.6%). Hashtags (103/121, 85.1%) were the most prominent metadata feature, followed by tweet time and date, retweets, replies, URLs, and geotags. More than two-fifths of the articles (51/121, 42.1%) used the application programming interface to collect Twitter data. Data analyses included qualitative thematic analysis, machine learning (eg, sentiment analysis, supervised machine learning, unsupervised machine learning, and social network analysis), and quantitative analysis. Only 10.7% (13/121) of the studies discussed ethical considerations.
We described the current state of using Twitter data for sexual violence research, developed a new taxonomy describing Twitter as a data source, and evaluated the methodologies. Research recommendations include the following: development of methods for data collection and analysis, in-depth discussions about ethical norms, exploration of specific aspects of sexual violence on Twitter, examination of tweets in multiple languages, and decontextualization of Twitter data. This review demonstrates the potential of using Twitter data in sexual violence research.