What Are People Concerned About During the Pandemic? Detecting Evolving Topics about COVID-19 from Twitter.

Author information

Chang Chia-Hsuan, Monselise Michal, Yang Christopher C

Affiliations

College of Computing and Informatics, Drexel University, Philadelphia, PA USA.

Department of Information Management, National Sun Yat-sen University, Kaohsiung City, Taiwan.

Publication information

J Healthc Inform Res. 2021;5(1):70-97. doi: 10.1007/s41666-020-00083-3. Epub 2021 Jan 17.

Abstract

With the novel coronavirus (COVID-19) pandemic affecting the lives of citizens in over 200 countries, policy makers and clinicians need to understand public sentiment and track the spread of the disease. Social media is one source of valuable insight into public sentiment. This study aims to extract that insight by producing a weekly list of the most discussed COVID-19 topics on Twitter and monitoring how those topics evolve from week to week. We propose two topic mining techniques that can handle a large-scale dataset, rolling online non-negative matrix factorization (Rolling-ONMF) and sliding online non-negative matrix factorization (Sliding-ONMF), and compare the insights produced by each. Each algorithm produces 425 topics over the course of 17 weeks. However, topics that have not evolved from one week to the next beyond a certain evolution threshold are consolidated into a single topic. Since the topics produced by the Rolling-ONMF algorithm each week depend on the topics from the previous week, we find that the Sliding-ONMF algorithm produces more varied topics each week; however, the topics produced by the Rolling-ONMF algorithm contain keywords that appear more consistent with each other on manual review. We also observe that the Sliding-ONMF algorithm captures events with shorter time frames rather than ones that last many months, while the Rolling-ONMF algorithm detects more general themes because its higher average evolution score leads to more topic consolidation. We also conducted a qualitative analysis and grouped the detected topics into themes. Important themes include government policy, economic crisis, COVID-19-related updates, COVID-19-related events, prevention, vaccines and treatments, and COVID-19 testing. These themes reflect the pandemic-related concerns expressed on social media.
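The consolidation step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the evolution score, so the code below assumes cosine similarity between topic-term vectors of consecutive weeks as a stand-in, and the names `cosine_similarity`, `consolidate`, and `threshold` are hypothetical. For scale, 425 topics over 17 weeks works out to 25 topics per week if the count is uniform across weeks; the toy data here uses 2 topics per week over 3 weeks.

```python
import numpy as np


def cosine_similarity(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b_norm = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return a_norm @ b_norm.T


def consolidate(weekly_topics, threshold=0.8):
    """Chain topics across consecutive weeks (assumed scoring, not the paper's).

    weekly_topics: list of (n_topics, vocab_size) arrays, one per week.
    A topic joins the chain of its most similar predecessor when the
    similarity reaches the threshold; otherwise it starts a new chain.
    Returns a list of chains, each a list of (week, topic_index) pairs.
    """
    chains = [[(0, j)] for j in range(weekly_topics[0].shape[0])]
    # Maps a topic index in the previous week to the chain it belongs to.
    open_chains = {j: chains[j] for j in range(len(chains))}
    for week in range(1, len(weekly_topics)):
        sim = cosine_similarity(weekly_topics[week], weekly_topics[week - 1])
        next_open = {}
        for j in range(sim.shape[0]):
            best = int(np.argmax(sim[j]))
            if sim[j, best] >= threshold and best in open_chains:
                # Each predecessor can be extended by at most one successor.
                chain = open_chains.pop(best)
            else:
                chain = []
                chains.append(chain)
            chain.append((week, j))
            next_open[j] = chain
        open_chains = next_open
    return chains


# Toy topic-term weights: 3 weeks, 2 topics per week, vocabulary of 3 terms.
weekly_topics = [
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    np.array([[1.0, 0.1, 0.0], [0.0, 0.0, 1.0]]),
    np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
]
chains = consolidate(weekly_topics, threshold=0.8)
print(len(chains), "consolidated topics from",
      sum(t.shape[0] for t in weekly_topics), "weekly topics")
```

Under this assumed scoring, a higher average similarity between consecutive weeks produces longer chains and fewer distinct topics, which is consistent with the abstract's remark that Rolling-ONMF consolidates more than Sliding-ONMF.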


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0391/8982709/0b7605df0500/41666_2020_83_Fig1_HTML.jpg
