Fine-Tuning BERT Models to Classify Misinformation on Garlic and COVID-19 on Twitter.

Affiliations

College of Pharmacy, Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.

College of Pharmacy, Yonsei University, Incheon 21983, Korea.

Publication information

Int J Environ Res Public Health. 2022 Apr 22;19(9):5126. doi: 10.3390/ijerph19095126.


DOI:10.3390/ijerph19095126
PMID:35564518
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9103576/

Abstract

Garlic-related misinformation is prevalent whenever a virus outbreak occurs. With the outbreak of COVID-19, garlic-related misinformation is spreading through social media, including Twitter. Bidirectional Encoder Representations from Transformers (BERT) can be used to classify misinformation from a vast number of tweets. This study aimed to apply the BERT model for classifying misinformation on garlic and COVID-19 on Twitter, using 5929 original tweets mentioning garlic and COVID-19 (4151 for fine-tuning, 1778 for testing). Tweets were manually labeled as 'misinformation' or 'other.' We fine-tuned five BERT models (BERT-base, BERT-large, BERTweet-base, BERTweet-COVID-19, and BERTweet-large) using either a general COVID-19 rumor dataset or a garlic-specific dataset. Accuracy and F1 score were calculated to evaluate the performance of the models. The BERT models fine-tuned with the COVID-19 rumor dataset showed poor performance, with a maximum accuracy of 0.647. BERT models fine-tuned with the garlic-specific dataset showed better performance. The BERTweet models achieved accuracy of 0.897-0.911, while BERT-base and BERT-large achieved accuracy of 0.887-0.897. BERTweet-large showed the best performance, with a maximum accuracy of 0.911 and an F1 score of 0.894. Thus, BERT models showed good performance in classifying misinformation. The results of our study will help detect misinformation related to garlic and COVID-19 on Twitter.
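As an illustration of the two metrics named in the abstract, the following is a minimal sketch of how accuracy and F1 score can be computed for a two-class task ('misinformation' vs. 'other'). The function name `accuracy_and_f1`, the 0/1 label encoding, and the toy labels are assumptions made for this example. The abstract does not say which F1 variant the authors reported, so this sketch assumes F1 on the 'misinformation' class.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels.

    `positive` is the label treated as the 'misinformation' class
    (an assumption of this sketch, not taken from the paper).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# The split sizes reported in the abstract add up to the full dataset.
n_finetune, n_test = 4151, 1778
assert n_finetune + n_test == 5929

# Toy labels (hypothetical, not data from the study): 1 = misinformation, 0 = other.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, f1 = accuracy_and_f1(toy_true, toy_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Accuracy counts every correct prediction equally, while F1 looks only at how well the positive class is found, so the two can diverge when the classes are imbalanced.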

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ded0/9103576/c078e73ac7fa/ijerph-19-05126-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ded0/9103576/970e3bc26416/ijerph-19-05126-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ded0/9103576/7f5f0924cb9b/ijerph-19-05126-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ded0/9103576/8dba364c8ffc/ijerph-19-05126-g004.jpg
