Understanding the vaccine stance of Italian tweets and addressing language changes through the COVID-19 pandemic: Development and validation of a machine learning model.

Affiliations

Multifactorial and Complex Diseases Research Area, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy.

Vaccine Research Department, FISABIO-Public Health, Valencia, Spain.

Publication information

Front Public Health. 2022 Jul 29;10:948880. doi: 10.3389/fpubh.2022.948880. eCollection 2022.

DOI:10.3389/fpubh.2022.948880
PMID:35968436
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9372360/
Abstract

Social media is increasingly being used to express opinions and attitudes toward vaccines. The vaccine stance of social media posts can be classified in almost real-time using machine learning. We describe the use of a Transformer-based machine learning model for analyzing the vaccine stance of Italian tweets, and demonstrate the need to address changes over time in vaccine-related language through periodic model retraining. Vaccine-related tweets were collected through a platform developed for the European Joint Action on Vaccination. Two datasets were collected, the first between November 2019 and June 2020, the second from April to September 2021. The tweets were manually categorized by three independent annotators. After cleaning, the total dataset consisted of 1,736 tweets in 3 categories (promotional, neutral, and discouraging). The manually classified tweets were used to train and test various machine learning models. The model that classified the data most similarly to humans was XLM-Roberta-large, a multilingual version of the Transformer-based model RoBERTa. The model hyper-parameters were tuned and the model was then run five times. The fine-tuned model with the best F-score over the validation dataset was selected. Running the selected fine-tuned model on just the first test dataset resulted in an accuracy of 72.8% (F-score 0.713). Using this model on the second test dataset resulted in a drop in accuracy of roughly 10 percentage points, to 62.1% (F-score 0.617), indicating that the model recognized a difference in language between the datasets. On the combined test datasets the accuracy was 70.1% (F-score 0.689). Retraining the model using data from the first and second datasets increased the accuracy over the second test dataset to 71.3% (F-score 0.713), an improvement of roughly 9 percentage points over training on the first dataset alone. The accuracy over the first test dataset remained the same at 72.8% (F-score 0.721). The accuracy over the combined test datasets was then 72.4% (F-score 0.720), an improvement of roughly 2 percentage points. Through fine-tuning a machine-learning model on task-specific data, the accuracy achieved in categorizing tweets was close to that expected of a single human annotator. Regular retraining of machine-learning models with recent data is advisable to maximize accuracy.
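The evaluation steps described in the abstract (scoring three-class predictions, choosing the best of several runs by validation F-score, and comparing accuracy across collection periods) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not say how the F-score was averaged, so macro averaging is an assumption here, and the function names and the toy labels are hypothetical. Only the accuracy percentages passed to `point_change` come from the abstract.

```python
# Minimal sketch of the evaluation logic; names and toy data are hypothetical.
LABELS = ("promotional", "neutral", "discouraging")


def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted stance matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def select_best_run(runs):
    """Return the name of the run with the highest validation F-score.

    `runs` maps a run name to a (y_true, y_pred) pair on the validation set.
    """
    return max(runs, key=lambda name: macro_f1(*runs[name]))


def point_change(before_pct, after_pct):
    """Change in accuracy in percentage points, rounded to one decimal."""
    return round(after_pct - before_pct, 1)


# Accuracy figures (percent) as reported in the abstract.
drop_on_second_period = point_change(72.8, 62.1)   # -10.7
gain_after_retraining = point_change(62.1, 71.3)   # 9.2
gain_on_combined = point_change(70.1, 72.4)        # 2.3

# Toy labels for illustration only; these are not data from the study.
toy_true = ["promotional", "neutral", "discouraging",
            "neutral", "promotional", "discouraging"]
toy_pred = ["promotional", "neutral", "neutral",
            "neutral", "discouraging", "discouraging"]
toy_runs = {"run_1": (toy_true, toy_pred), "run_2": (toy_true, toy_true)}
best_run = select_best_run(toy_runs)
```

The differences computed from the reported accuracies are in percentage points (10.7, 9.2, and 2.3), which the abstract rounds to 10%, 9%, and 2%.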

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c07/9372360/219f5e437729/fpubh-10-948880-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c07/9372360/e6c1885636d1/fpubh-10-948880-g0001.jpg
