Evaluating sentiment analysis models: A comparative analysis of vaccination tweets during the COVID-19 phase leveraging DistilBERT for enhanced insights.

Authors

Agrawal Renuka, Majumder Mehuli, Yadav Ishita, Taneja Nandini, Hamdare Safa, Hemnani Preeti

Affiliations

Symbiosis Institute of Technology - Pune Campus, Symbiosis International (Deemed University), Pune, India.

Nottingham Trent University - Clifton Campus, Nottingham, UK.

Publication information

MethodsX. 2025 May 30;14:103407. doi: 10.1016/j.mex.2025.103407. eCollection 2025 Jun.

DOI:10.1016/j.mex.2025.103407
PMID:40529516
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12171565/
Abstract

This study investigates public sentiment toward COVID-19 vaccinations by analyzing Twitter data using advanced machine learning (ML) and natural language processing (NLP) techniques. Recognizing social media as a valuable source for gauging public opinion during health crises, the research aims to inform policies on content moderation and misinformation control.

• Comparative Analysis of Embedding Techniques and ML Models: The study evaluates two embedding techniques (TF-IDF and Word2Vec) across five ML models: LinearSVC, Random Forest, Gradient Boosting Machine (GBM), XGBoost, and AdaBoost.
• The models were tested using two training-testing splits (70-30 and 80-20) to assess their performance on noisy, unlabeled, and imbalanced sentiment data.
• Utilization of DistilBERT for Pseudo-Labeling: To enhance labeling accuracy, DistilBERT was employed for pseudo-labeling, capturing semantic nuances often missed by traditional ML techniques. This approach enabled more effective sentiment classification of tweets.

The findings underscore the effectiveness of automated annotation, hybrid modeling, and embedding strategies in analyzing unstructured social media data. Such approaches provide valuable insights for public health applications, particularly in understanding vaccine hesitancy and shaping communication strategies. The study highlights the potential of integrating advanced NLP techniques to better comprehend and respond to public sentiments during pandemics or similar emergencies.
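The comparison design described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. It covers only the TF-IDF embedding and two of the five models (LinearSVC and Random Forest), evaluated on the 70-30 and 80-20 splits. The toy tweets, their labels, the `compare` helper, the hyperparameters, and the choice of macro-F1 as the metric are all assumptions made for illustration; in the study, the labels would come from DistilBERT pseudo-labeling.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical stand-in for pseudo-labeled tweets (not data from the study).
positive = [f"grateful for my vaccine dose {i}, feeling safe and hopeful" for i in range(10)]
negative = [f"worried about vaccine side effects {i}, feeling scared and angry" for i in range(10)]
texts = positive + negative
labels = [1] * len(positive) + [0] * len(negative)

# Factories so that every (model, split) run starts from a fresh estimator.
models = {
    "LinearSVC": lambda: LinearSVC(),
    "RandomForest": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
}


def compare(texts, labels, test_sizes=(0.3, 0.2)):
    """Return macro-F1 for each (model name, test size) combination."""
    results = {}
    for test_size in test_sizes:
        # Stratify so both sentiment classes appear in train and test sets.
        X_train, X_test, y_train, y_test = train_test_split(
            texts, labels, test_size=test_size, stratify=labels, random_state=0
        )
        for name, make_model in models.items():
            # Fitting TF-IDF inside the pipeline keeps test vocabulary
            # statistics out of training.
            pipe = make_pipeline(TfidfVectorizer(), make_model())
            pipe.fit(X_train, y_train)
            predictions = pipe.predict(X_test)
            results[(name, test_size)] = f1_score(y_test, predictions, average="macro")
    return results


results = compare(texts, labels)
for (name, test_size), score in sorted(results.items()):
    print(f"{name:12s} test_size={test_size:.1f} macro-F1={score:.2f}")
```

Extending the sketch to the remaining models or to Word2Vec features would mean adding entries to `models` or swapping the vectorizer step. On imbalanced data such as that described in the abstract, a class-aware metric like macro-F1 is usually more informative than plain accuracy.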



