Department of Anesthesiology and Division of Infectious Diseases and Global Public Health, University of California San Diego School of Medicine, La Jolla, CA, United States.
S-3 Research LLC, San Diego, CA, United States.
JMIR Public Health Surveill. 2020 Apr 21;6(2):e18700. doi: 10.2196/18700.
The coronavirus disease (COVID-19) pandemic, which began in Wuhan, China, in December 2019, is rapidly spreading worldwide, with over 1.9 million cases as of mid-April 2020. Infoveillance approaches using social media can help characterize disease distribution and public knowledge, attitudes, and behaviors critical to the early stages of an outbreak.
The aim of this study is to conduct a quantitative and qualitative assessment of Chinese social media posts originating in Wuhan City on the Chinese microblogging platform Weibo during the early stages of the COVID-19 outbreak.
Chinese-language messages from Wuhan were collected for 39 days between December 23, 2019, and January 30, 2020, on Weibo. For quantitative analysis, the total daily cases of COVID-19 in Wuhan were obtained from the Chinese National Health Commission, and a linear regression model was used to determine if Weibo COVID-19 posts were predictive of the number of cases reported. Qualitative content analysis and an inductive manual coding approach were used to identify parent classifications of news and user-generated COVID-19 topics.
A total of 115,299 Weibo posts were collected during the study time frame, averaging 2956 posts per day (minimum 0, maximum 13,587). Quantitative analysis found a positive correlation between the number of Weibo posts and the number of reported cases from Wuhan, with approximately 10 more COVID-19 cases per 40 additional social media posts (P<.001). This effect size was also larger than that observed for the rest of China excluding Hubei Province (of which Wuhan is the capital city) and held when comparing the number of Weibo posts to the incidence proportion of cases in Hubei Province. Qualitative analysis of 11,893 posts during the first 21 days of the study period with COVID-19-related posts uncovered four parent classifications: Weibo discussions about the causative agent of the disease, changing epidemiological characteristics of the outbreak, public reaction to outbreak control and response measures, and other topics. Generally, these themes also exhibited public uncertainty and changing knowledge and attitudes about COVID-19, including posts exhibiting both protective and higher-risk behaviors.
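The regression relationship reported above can be illustrated with a minimal sketch of simple ordinary least squares, assuming a single predictor (daily post count) and a single outcome (daily reported cases). The abstract does not specify the exact model form, so the function `fit_simple_ols` and the variables `daily_posts` and `daily_cases` are hypothetical, and the toy numbers are constructed for illustration only; they are not the study's data.

```python
from typing import Sequence, Tuple


def fit_simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data, NOT the study's data: constructed so that
# cases = 2 + 0.25 * posts holds exactly.
daily_posts = [0, 40, 80, 120, 160]
daily_cases = [2, 12, 22, 32, 42]

slope, intercept = fit_simple_ols(daily_posts, daily_cases)

# Express the slope on the scale used in the abstract: cases per 40 posts.
cases_per_40_posts = slope * 40
```

On this scale, an association of roughly 10 additional cases per 40 additional posts corresponds to a slope of about 0.25 cases per post. The sketch covers only the point estimate; it does not reproduce the significance test behind the reported P value.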
The results of this study provide initial insight into the origins of the COVID-19 outbreak based on quantitative and qualitative analysis of Chinese social media data at the initial epicenter in Wuhan City. Future studies should continue to explore the utility of social media data to predict COVID-19 disease severity, measure public reaction and behavior, and evaluate effectiveness of outbreak communication.