
Design and analysis of a large-scale COVID-19 tweets dataset.

Author information

Lamsal Rabindra

Affiliation

School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, 110067 India.

Publication information

Appl Intell (Dordr). 2021;51(5):2790-2804. doi: 10.1007/s10489-020-02029-z. Epub 2020 Nov 6.

DOI: 10.1007/s10489-020-02029-z
PMID: 34764561
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7646503/

Abstract

As of July 17, 2020, more than thirteen million people had been diagnosed with the Novel Coronavirus (COVID-19), and half a million people had lost their lives to this infectious disease. The World Health Organization declared the COVID-19 outbreak a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents the COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset's geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets' design in detail and analyzes the tweets in both datasets. The datasets are released publicly, in the expectation that they will contribute to a better understanding of the spatial and temporal dimensions of the public discourse related to the ongoing pandemic. According to the access statistics, the datasets (Lamsal 2020a, 2020b) have collectively been accessed over 74.5k times.

Similar articles

1
Design and analysis of a large-scale COVID-19 tweets dataset.
Appl Intell (Dordr). 2021;51(5):2790-2804. doi: 10.1007/s10489-020-02029-z. Epub 2020 Nov 6.
2
MonkeyPox2022Tweets: A Large-Scale Twitter Dataset on the 2022 Monkeypox Outbreak, Findings from Analysis of Tweets, and Open Research Questions.
Infect Dis Rep. 2022 Nov 14;14(6):855-883. doi: 10.3390/idr14060087.
3
Tracking discussions of complementary, alternative, and integrative medicine in the context of the COVID-19 pandemic: a month-by-month sentiment analysis of Twitter data.
BMC Complement Med Ther. 2022 Apr 13;22(1):105. doi: 10.1186/s12906-022-03586-1.
4
Twitter conversations predict the daily confirmed COVID-19 cases.
Appl Soft Comput. 2022 Nov;129:109603. doi: 10.1016/j.asoc.2022.109603. Epub 2022 Sep 5.
5
Topics, Trends, and Sentiments of Tweets About the COVID-19 Pandemic: Temporal Infoveillance Study.
J Med Internet Res. 2020 Oct 23;22(10):e22624. doi: 10.2196/22624.
6
Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set.
JMIR Public Health Surveill. 2020 May 29;6(2):e19273. doi: 10.2196/19273.
7
Top Concerns of Tweeters During the COVID-19 Pandemic: Infoveillance Study.
J Med Internet Res. 2020 Apr 21;22(4):e19016. doi: 10.2196/19016.
8
An augmented multilingual Twitter dataset for studying the COVID-19 infodemic.
Soc Netw Anal Min. 2021;11(1):102. doi: 10.1007/s13278-021-00825-0. Epub 2021 Oct 20.
9
BillionCOV: An enriched billion-scale collection of COVID-19 tweets for efficient hydration.
Data Brief. 2023 Jun;48:109229. doi: 10.1016/j.dib.2023.109229. Epub 2023 May 12.
10
Social Media Insights Into US Mental Health During the COVID-19 Pandemic: Longitudinal Analysis of Twitter Data.
J Med Internet Res. 2020 Dec 14;22(12):e21418. doi: 10.2196/21418.

Cited by

1
Psychological crisis and emergency response in public health emergencies: a case study of the Mpox epidemic.
BMC Psychol. 2025 Aug 22;13(1):956. doi: 10.1186/s40359-025-03309-4.
2
Dissecting the infodemic: An in-depth analysis of COVID-19 misinformation detection on X (formerly Twitter) utilizing machine learning and deep learning techniques.
Heliyon. 2024 Sep 12;10(18):e37760. doi: 10.1016/j.heliyon.2024.e37760. eCollection 2024 Sep 30.
3
Examining media's coverage of COVID-19 vaccines and social media sentiments on vaccine manufacturers' stock prices.
Front Public Health. 2024 Aug 13;12:1411345. doi: 10.3389/fpubh.2024.1411345. eCollection 2024.
4
COVIDHealth: A novel labeled dataset and machine learning-based web application for classifying COVID-19 discourses on Twitter.
Heliyon. 2024 Jul 8;10(14):e34103. doi: 10.1016/j.heliyon.2024.e34103. eCollection 2024 Jul 30.
5
Holiday Tweets: A Qualitative Analysis of How Major Health Organizations Addressed Culture During the COVID-19 Pandemic.
Inquiry. 2024 Jan-Dec;61:469580241266346. doi: 10.1177/00469580241266346.
6
Social media users' perceptions about health mis- and disinformation on social media.
Health Aff Sch. 2023 Oct;1(4). doi: 10.1093/haschl/qxad050. Epub 2023 Sep 26.
7
Topics in Antivax and Provax Discourse: Yearlong Synoptic Study of COVID-19 Vaccine Tweets.
J Med Internet Res. 2023 Aug 8;25:e45069. doi: 10.2196/45069.
8
Deep learning for COVID-19 topic modelling via Twitter: Alpha, Delta and Omicron.
PLoS One. 2023 Aug 1;18(8):e0288681. doi: 10.1371/journal.pone.0288681. eCollection 2023.
9
A Survey on COVID-19 Data Analysis Using AI, IoT, and Social Media.
Sensors (Basel). 2023 Jun 13;23(12):5543. doi: 10.3390/s23125543.
10
Critical reflections on three popular computational linguistic approaches to examine Twitter discourses.
PeerJ Comput Sci. 2023 Jan 30;9:e1211. doi: 10.7717/peerj-cs.1211. eCollection 2023.

References

1
A Large-Scale COVID-19 Twitter Chatter Dataset for Open Scientific Research-An International Collaboration.
Epidemiologia (Basel). 2021 Aug 5;2(3):315-324. doi: 10.3390/epidemiologia2030024.
2
Sentiment Analysis and Emotion Understanding during the COVID-19 Pandemic in Spain and Its Impact on Digital Ecosystems.
Int J Environ Res Public Health. 2020 Jul 31;17(15):5542. doi: 10.3390/ijerph17155542.
3
Examining the Impact of COVID-19 Lockdown in Wuhan and Lombardy: A Psycholinguistic Analysis on Weibo and Twitter.
Int J Environ Res Public Health. 2020 Jun 24;17(12):4552. doi: 10.3390/ijerph17124552.
4
Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set.
JMIR Public Health Surveill. 2020 May 29;6(2):e19273. doi: 10.2196/19273.
5
Global Sentiments Surrounding the COVID-19 Pandemic on Twitter: Analysis of Twitter Trends.
JMIR Public Health Surveill. 2020 May 22;6(2):e19447. doi: 10.2196/19447.
6
COVID-19 and the 5G Conspiracy Theory: Social Network Analysis of Twitter Data.
J Med Internet Res. 2020 May 6;22(5):e19458. doi: 10.2196/19458.
7
Conversations and Medical News Frames on Twitter: Infodemiological Study on COVID-19 in South Korea.
J Med Internet Res. 2020 May 5;22(5):e18897. doi: 10.2196/18897.
8
Top Concerns of Tweeters During the COVID-19 Pandemic: Infoveillance Study.
J Med Internet Res. 2020 Apr 21;22(4):e19016. doi: 10.2196/19016.
9
Effects of Social Grooming on Incivility in COVID-19.
Cyberpsychol Behav Soc Netw. 2020 Aug;23(8):519-525. doi: 10.1089/cyber.2020.0201. Epub 2020 Apr 3.
10
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.
Lancet. 2020 Feb 15;395(10223):497-506. doi: 10.1016/S0140-6736(20)30183-5. Epub 2020 Jan 24.