Xue Jia, Chen Junxiang, Hu Ran, Chen Chen, Zheng Chengda, Su Yue, Zhu Tingshao
Factor-Inwentash Faculty of Social Work, University of Toronto, Toronto, ON, Canada.
Faculty of Information, University of Toronto, Toronto, ON, Canada.
J Med Internet Res. 2020 Nov 25;22(11):e20550. doi: 10.2196/20550.
It is important to measure the public response to the COVID-19 pandemic. Twitter is an important data source for infodemiology studies involving public response monitoring.
The objective of this study is to examine COVID-19-related discussions, concerns, and sentiments using tweets posted by Twitter users.
We analyzed 4 million Twitter messages related to the COVID-19 pandemic, posted from March 7 to April 21, 2020, and collected using a list of 20 hashtags (eg, "coronavirus," "COVID-19," "quarantine"). We used a machine learning approach, Latent Dirichlet Allocation (LDA), to identify popular unigrams and bigrams, salient topics and themes, and sentiments in the collected tweets.
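As an illustration of the unigram and bigram step only, the following is a minimal sketch that counts n-grams in a handful of made-up tweets. The tokenizer, the stop-word list, the sample tweets, and the function names `tokenize` and `count_ngrams` are all assumptions made for this example; the abstract does not describe the study's actual preprocessing, and the LDA topic model itself is not shown.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not the study's list.
STOP_WORDS = {"the", "a", "to", "and", "of", "in", "is", "for", "at", "we", "are"}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and return non-stop-word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    tokens = re.findall(r"[a-z0-9]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def count_ngrams(tweets, n):
    """Count n-grams (as tuples of n consecutive tokens) across all tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        # zip over n shifted copies of the token list yields consecutive n-tuples
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Made-up example tweets, not data from the study.
sample_tweets = [
    "Stay home and practice social distancing #COVID-19",
    "New cases reported today, stay home! https://example.com",
    "Lockdown extended: social distancing is working @someone",
]

unigrams = count_ngrams(sample_tweets, 1)
bigrams = count_ngrams(sample_tweets, 2)

print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

With this particular tokenizer, the hashtag "#COVID-19" splits into the two tokens "covid" and "19", which is one way a term such as "COVID-19" could be counted as a bigram rather than a unigram. Note also that stop words are removed before n-grams are formed, so words on either side of a removed stop word become adjacent.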
Popular unigrams included "virus," "lockdown," and "quarantine." Popular bigrams included "COVID-19," "stay home," "corona virus," "social distancing," and "new cases." We identified 13 discussion topics and categorized them into 5 different themes: (1) public health measures to slow the spread of COVID-19, (2) social stigma associated with COVID-19, (3) COVID-19 news, cases, and deaths, (4) COVID-19 in the United States, and (5) COVID-19 in the rest of the world. Across all identified topics, the dominant sentiments for the spread of COVID-19 were anticipation that measures can be taken, followed by mixed feelings of trust, anger, and fear related to different topics. The public tweets revealed a significant feeling of fear when people discussed new COVID-19 cases and deaths compared to other topics.
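The per-topic sentiment comparison described above can be pictured with a small lexicon-based tally. This is a sketch under stated assumptions, not the study's method: the abstract does not say how sentiments were scored, and the word-to-emotion lexicon, the topic labels, the tokenized tweets, and the function name `emotion_shares` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to emotions; not the study's resource.
EMOTION_LEXICON = {
    "hope": "anticipation",
    "soon": "anticipation",
    "protect": "trust",
    "doctor": "trust",
    "outrage": "anger",
    "blame": "anger",
    "deaths": "fear",
    "afraid": "fear",
}


def emotion_shares(tokens_by_topic):
    """Return, for each topic, the share of lexicon hits belonging to each emotion."""
    shares = {}
    for topic, docs in tokens_by_topic.items():
        counts = Counter(
            EMOTION_LEXICON[token]
            for doc in docs
            for token in doc
            if token in EMOTION_LEXICON
        )
        total = sum(counts.values())
        # Topics with no lexicon hits get an empty distribution
        shares[topic] = {e: c / total for e, c in counts.items()} if total else {}
    return shares


# Made-up tokenized tweets grouped by a made-up topic assignment.
tokens_by_topic = {
    "new cases and deaths": [["deaths", "rising", "afraid"], ["hope", "deaths"]],
    "public health measures": [["protect", "doctor", "hope"], ["soon", "blame"]],
}

shares = emotion_shares(tokens_by_topic)
for topic, dist in shares.items():
    print(topic, dist)
```

Comparing the resulting distributions across topics is the kind of view in which fear could stand out for one topic relative to the others; the numbers produced here reflect only the toy inputs.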
This study showed that Twitter data and machine learning approaches can be leveraged for an infodemiology study, enabling research into evolving public discussions and sentiments during the COVID-19 pandemic. As the situation rapidly evolves, several topics are consistently dominant on Twitter, such as confirmed cases and death rates, preventive measures, health authorities and government policies, COVID-19 stigma, and negative psychological reactions (eg, fear). Real-time monitoring and assessment of Twitter discussions and concerns could provide useful data for public health emergency responses and planning. Pandemic-related fear, stigma, and mental health concerns are already evident and may continue to influence public trust when a second wave of COVID-19 occurs or there is a new surge of the current pandemic.