Chen Emily, Jiang Julie, Chang Ho-Chun Herbert, Muric Goran, Ferrara Emilio
Information Sciences Institute University of Southern California Marina del Rey, CA United States.
Department of Computer Science University of Southern California Los Angeles, CA United States.
JMIR Infodemiology. 2022 Feb 8;2(1):e32378. doi: 10.2196/32378. eCollection 2022 Jan-Jun.
The novel coronavirus, also known as SARS-CoV-2, has come to define much of our lives since the beginning of 2020. During this time, countries around the world imposed lockdowns and social distancing measures. The physical movements of people ground to a halt, while their online interactions increased as they turned to engaging with each other virtually. As the means of communication shifted online, information consumption also shifted online. Governing authorities and health agencies have intentionally shifted their focus to use social media and online platforms to spread factual and timely information. However, this has also opened the gate for misinformation, contributing to and accelerating the phenomenon of misinfodemics.
We analyzed Twitter discourse in over 1 billion tweets related to COVID-19, collected over more than a year, to identify and investigate prevalent misinformation narratives and trends. We also aimed to describe the Twitter audience that is more susceptible to health-related misinformation and the network mechanisms driving misinfodemics.
We leveraged a data set that we collected and made public, which contained over 1 billion tweets related to COVID-19 posted between January 2020 and April 2021. We created a subset of this larger data set by isolating tweets that included URLs whose domains had been identified by Media Bias/Fact Check as prone to publishing questionable content and misinformation. Using clustering and topic modeling techniques, we identified the major narratives present within this subset of tweets, including health misinformation and conspiracies.
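The domain-based subsetting step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the tweet structure (a dictionary with a `urls` list), the helper names, and the placeholder domains in `FLAGGED_DOMAINS` are all assumptions made for illustration, and the real list of flagged domains would come from Media Bias/Fact Check.

```python
from urllib.parse import urlparse

# Hypothetical placeholder domains; NOT actual Media Bias/Fact Check entries.
FLAGGED_DOMAINS = {"questionable.example", "conspiracy.example"}


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_flagged(url, flagged=FLAGGED_DOMAINS):
    """True if the URL's host is a flagged domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in flagged)


def filter_tweets(tweets, flagged=FLAGGED_DOMAINS):
    """Keep tweets that contain at least one URL from a flagged domain.

    Each tweet is assumed to be a dict with a 'urls' list of expanded URLs.
    """
    return [
        t for t in tweets
        if any(is_flagged(u, flagged) for u in t.get("urls", []))
    ]
```

The sketch assumes URLs are already expanded and carry a scheme such as `https://`; shortened links would need to be resolved before matching, and URLs without a scheme are not matched.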
Our focus was on a subset of 12,689,165 tweets that we determined were representative of COVID-19 misinformation narratives in our full data set. When analyzing tweets that shared content from domains known to be questionable or that promoted misinformation, we found that a few key misinformation narratives emerged about hydroxychloroquine and alternative medicines, US officials and governing agencies, and COVID-19 prevention measures. We further analyzed the misinformation retweet network and found that users who shared both questionable and conspiracy-related content were clustered more closely in the network than others, supporting the hypothesis that echo chambers can contribute to the spread of health misinfodemics.
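One simple way to compare how tightly different groups of users are connected in a retweet network is to measure the density of retweet links inside each group. The sketch below is only an illustration under stated assumptions: the abstract does not specify which clustering measure was used, and the function name, the edge format (pairs of retweeter and retweeted user), and the density measure itself are hypothetical choices.

```python
def within_group_density(edges, group):
    """Fraction of possible directed retweet links among `group` that exist.

    `edges` is an iterable of (retweeter, retweeted) user pairs.
    Returns 0.0 for groups with fewer than two members.
    """
    members = set(group)
    n = len(members)
    if n < 2:
        return 0.0
    # Count each distinct directed link once; ignore self-retweets.
    present = {
        (a, b) for a, b in edges
        if a in members and b in members and a != b
    }
    return len(present) / (n * (n - 1))
```

A higher value for one group than for another would indicate that its members retweet each other more densely, which is the kind of pattern an echo chamber would produce.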
We presented a summary and analysis of the major misinformation discourse surrounding COVID-19 and those who promoted and engaged with it. While misinformation is not limited to social media platforms, we hope that our insights, particularly pertaining to health-related emergencies, will help pave the way for computational infodemiology to inform health surveillance and interventions.