A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis.

Affiliations

Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan.

Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur, Punjab, Pakistan.

Publication information

PLoS One. 2021 Feb 25;16(2):e0245909. doi: 10.1371/journal.pone.0245909. eCollection 2021.

DOI:10.1371/journal.pone.0245909

PMID:33630869

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7906356/

Abstract

The spread of Covid-19 has raised health concerns worldwide, and social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to use resources optimally and appropriately. In this research, we perform sentiment analysis of Covid-19 tweets using a supervised machine learning approach. Identifying the sentiments expressed in Covid-19 tweets would support informed decisions for better handling of the current pandemic. The dataset is extracted from Twitter using IDs provided by IEEE DataPort. Tweets are retrieved by an in-house crawler built on the Tweepy library. The dataset is cleaned using preprocessing techniques, and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set, which is formed by concatenating bag-of-words and term frequency-inverse document frequency (TF-IDF) features. Tweets are classified as positive, neutral, or negative. Classifier performance is evaluated in terms of accuracy, precision, recall, and F1 score. For completeness, the dataset is further investigated using the Long Short-Term Memory (LSTM) deep learning architecture. The results show that the Extra Trees classifier outperforms all other models, achieving an accuracy of 0.93 with our proposed concatenated feature set. The LSTM achieves lower accuracy than the machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
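The concatenated feature set described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names (`build_vocab`, `bow_vector`, `tfidf_vector`, `concat_features`), and the example tweets are all assumptions made for illustration. The sketch only shows how a bag-of-words vector and a TF-IDF vector for the same tweet might be joined into one feature vector.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Sorted vocabulary over lowercased, whitespace-tokenized documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def bow_vector(doc, vocab):
    """Bag-of-words: raw term counts in vocabulary order."""
    counts = Counter(doc.lower().split())
    return [counts[t] for t in vocab]


def tfidf_vector(doc, vocab, docs):
    """TF-IDF using an assumed smoothed IDF: log((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    tokenized = [set(d.lower().split()) for d in docs]
    counts = Counter(doc.lower().split())
    vec = []
    for t in vocab:
        df = sum(1 for toks in tokenized if t in toks)  # document frequency
        idf = math.log((1 + n) / (1 + df)) + 1
        vec.append(counts[t] * idf)
    return vec


def concat_features(docs):
    """Concatenate the BoW and TF-IDF vectors of each document."""
    vocab = build_vocab(docs)
    feats = [bow_vector(d, vocab) + tfidf_vector(d, vocab, docs) for d in docs]
    return vocab, feats


# Hypothetical, already-cleaned example tweets (not from the paper's dataset).
tweets = ["stay safe stay home", "vaccine news is good", "lockdown is bad"]
vocab, features = concat_features(tweets)
print(len(vocab), len(features[0]))
```

Each resulting vector has twice the vocabulary length: the first half holds counts and the second half holds TF-IDF weights. Vectors of this shape, paired with the positive, neutral, or negative labels, could then be passed to a classifier. In practice a library vectorizer with sparse matrices would likely replace these hand-written helpers.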


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9232/7906356/cff6588a45f7/pone.0245909.g001.jpg



