College of Engineering, Computer Science and Information Technology Department, Abu Dhabi University, Abu Dhabi, United Arab Emirates.
College of Engineering, Electrical, Computer, and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi, United Arab Emirates.
BMC Public Health. 2023 Jun 21;23(1):1193. doi: 10.1186/s12889-023-16067-y.
The spread of misinformation of all types threatens people's safety and hinders the resolution of problems. COVID-19 vaccination has been widely discussed on social media platforms, where misleading and false information abounds. This false information critically affects the safety of society because it deters many people from taking the vaccine, slowing the world's return to normal. It is therefore vital to analyze the content shared on social media platforms, detect misinformation, identify its aspects, and efficiently present the related statistics in order to combat the spread of misleading information about the vaccine. This paper aims to support stakeholders in decision-making by providing solid, current insights into the spatiotemporal progression of the common misinformation aspects of the various available vaccines.
Approximately 3800 tweets were annotated with four expert-verified aspects of vaccine misinformation obtained from reliable medical resources. Next, an Aspect-based Misinformation Analysis Framework was designed using the Light Gradient Boosting Machine (LightGBM) model, a fast and efficient gradient-boosting machine learning model. Based on this dataset, spatiotemporal statistical analysis was performed to gain insight into how aspects of vaccine misinformation progressed among the public. Finally, Pearson correlation coefficients and p-values were calculated for the global misinformation count against the vaccination counts of 43 countries from December 2020 until July 2021.
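The correlation step can be illustrated with a minimal sketch. It assumes one monthly count series per country aligned with a global monthly misinformation count; the country names and all numbers below are hypothetical placeholders, not data from the study. The sketch computes the Pearson coefficient and the associated t statistic, from which a p-value would be obtained using Student's t distribution with n - 2 degrees of freedom (that last step is left out to keep the example dependency-free).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for H0 'no correlation'; compare with Student's t, n - 2 df."""
    denom = 1.0 - r * r
    if denom <= 0.0:
        # Perfect correlation: the statistic diverges.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)


# Hypothetical monthly counts for eight months (illustrative only).
misinformation = [120, 150, 170, 160, 140, 110, 90, 80]
vaccinations = {
    "Country A": [10, 20, 35, 50, 70, 90, 110, 130],
    "Country B": [5, 9, 12, 11, 10, 8, 6, 5],
}

results = {
    country: pearson_r(misinformation, counts)
    for country, counts in vaccinations.items()
}
negative_share = sum(r < 0 for r in results.values()) / len(results)

for country, r in results.items():
    t = t_statistic(r, len(misinformation))
    print(f"{country}: r = {r:.3f}, t = {t:.3f}")
print(f"share of countries with negative correlation: {negative_share:.0%}")
```

A negative coefficient here only says that vaccination counts tended to be lower in months when misinformation counts were higher; whether the coefficient is significant depends on the p-value derived from the t statistic.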
The optimized per-class classification accuracy (i.e., per aspect of misinformation) was 87.4%, 92.7%, 80.1%, and 82.5% for the "Vaccine Constituent," "Adverse Effects," "Agenda," and "Efficacy and Clinical Trials" aspects, respectively. The model achieved an Area Under the ROC Curve (AUC) of 90.3% and 89.6% for validation and testing, respectively, which indicates the reliability of the proposed framework in detecting aspects of vaccine misinformation on Twitter. The correlation analysis shows that 37% of the countries addressed in this study were negatively affected by the spread of misinformation on Twitter, resulting in a reduced number of administered vaccines during the same timeframe.
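To make the reported metrics concrete, the following is a minimal sketch of how per-class accuracy and a multi-class AUC can be computed. It rests on two assumptions that the abstract does not confirm: that per-class accuracy is taken one-vs-rest, as (TP + TN) / N, and that the multi-class AUC is a macro average of one-vs-rest AUCs. The function names and the tiny example inputs are illustrative, not the authors' code or data.

```python
def per_class_accuracy(y_true, y_pred, labels):
    """One-vs-rest accuracy per label: (TP + TN) / N."""
    n = len(y_true)
    scores = {}
    for label in labels:
        # A sample counts as correct when truth and prediction agree on
        # whether it belongs to this label.
        correct = sum(
            (t == label) == (p == label) for t, p in zip(y_true, y_pred)
        )
        scores[label] = correct / n
    return scores


def binary_auc(is_positive, scores):
    """AUC as P(random positive outranks random negative); ties count 1/2."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, labels):
    """Unweighted mean of one-vs-rest AUCs; proba is a list of label->score dicts."""
    aucs = [
        binary_auc([t == label for t in y_true], [row[label] for row in proba])
        for label in labels
    ]
    return sum(aucs) / len(aucs)


# Illustrative labels taken from the aspect names; predictions are made up.
labels = ["Vaccine Constituent", "Adverse Effects", "Agenda"]
y_true = ["Agenda", "Adverse Effects", "Agenda", "Vaccine Constituent"]
y_pred = ["Agenda", "Adverse Effects", "Vaccine Constituent", "Vaccine Constituent"]
print(per_class_accuracy(y_true, y_pred, labels))
```

The pairwise AUC above is quadratic in the number of samples, which is acceptable for a dataset of a few thousand tweets; a rank-based formulation would scale better.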
Twitter is a rich source of insight into the progression of vaccine misinformation among the public. Machine learning models such as LightGBM are efficient for multi-class classification and proved reliable in classifying aspects of vaccine misinformation, even with the limited samples available in social media datasets.