Argyris Young Anna, Zhang Nan, Bashyal Bidhan, Tan Pang-Ning
Dept of Media and Information, Michigan State University, East Lansing, MI.
Dept of Advertising and Public Relations, Michigan State University, East Lansing, MI.
2022 IEEE Int Conf Digit Health IEEE ICDH 2022 (2022). 2022 Jul;2022:107-116. doi: 10.1109/icdh55609.2022.00025. Epub 2022 Aug 24.
Anti-vaccine content propagates rapidly via social media, fostering vaccine hesitancy, while pro-vaccine content has not matched its opponents' success. Despite this disparity in the dissemination of anti- and pro-vaccine posts, the linguistic features that facilitate or inhibit the propagation of vaccine-related content remain poorly understood. Moreover, most prior machine-learning algorithms classified social-media posts into binary categories (e.g., misinformation or not) and have rarely tackled a higher-order classification task based on divergent perspectives about vaccines (e.g., anti-vaccine, pro-vaccine, and neutral). Our objectives are (1) to identify sets of linguistic features that facilitate and inhibit the propagation of vaccine-related content and (2) to compare whether anti-vaccine, pro-vaccine, and neutral tweets contain either set more frequently than the others. To achieve these goals, we collected a large set of social media posts (over 120 million tweets) between Nov. 15 and Dec. 15, 2021, coinciding with the Omicron variant surge. We developed a two-stage framework using a fine-tuned BERT classifier, which achieved over 99 and 80 percent accuracy for binary and ternary classification, respectively. We then used the Linguistic Inquiry and Word Count (LIWC) text analysis tool to count linguistic features in each classified tweet. Our regression results show that anti-vaccine tweets are propagated (i.e., retweeted), while pro-vaccine tweets garner passive endorsements (i.e., favorited). Our results also yielded two sets of linguistic features that act as facilitators and inhibitors of the propagation of vaccine-related tweets. Finally, our regression results show that anti-vaccine tweets tend to use the facilitators, while their pro-vaccine counterparts employ the inhibitors. The findings and algorithms from this study will aid public health officials' efforts to counteract vaccine misinformation, thereby facilitating the delivery of preventive measures during pandemics and epidemics.
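The two-stage framework mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what the binary stage decides, so the sketch assumes it filters vaccine-relevant tweets before the ternary stance step. The function names (`two_stage_classify`, `toy_is_relevant`, `toy_stance_of`) and the keyword rules are hypothetical stand-ins for the fine-tuned BERT models, not the authors' implementation.

```python
from typing import Callable, Iterable, List, Optional

# Ternary stance labels named in the abstract.
STANCES = ("anti-vaccine", "pro-vaccine", "neutral")


def two_stage_classify(
    tweets: Iterable[str],
    is_relevant: Callable[[str], bool],
    stance_of: Callable[[str], str],
) -> List[Optional[str]]:
    """Run a binary filter, then a ternary stance classifier.

    Returns one entry per tweet: a stance label if the tweet passed the
    binary stage, or None if it was filtered out.
    """
    results: List[Optional[str]] = []
    for text in tweets:
        # Stage 1 (assumed): binary decision on whether to keep the tweet.
        if not is_relevant(text):
            results.append(None)
            continue
        # Stage 2: ternary stance classification on the kept tweets.
        label = stance_of(text)
        if label not in STANCES:
            raise ValueError(f"unexpected stance label: {label!r}")
        results.append(label)
    return results


# Toy keyword rules standing in for the fine-tuned classifiers (hypothetical).
def toy_is_relevant(text: str) -> bool:
    return "vaccine" in text.lower()


def toy_stance_of(text: str) -> str:
    lowered = text.lower()
    if "dangerous" in lowered:
        return "anti-vaccine"
    if "safe" in lowered:
        return "pro-vaccine"
    return "neutral"


if __name__ == "__main__":
    sample = [
        "Vaccines are safe",
        "Nice weather today",
        "The vaccine is dangerous",
        "Vaccine clinic opens Monday",
    ]
    for text, label in zip(sample, two_stage_classify(sample, toy_is_relevant, toy_stance_of)):
        print(f"{label or 'filtered out'}: {text}")
```

Passing the two classifiers in as callables keeps the cascade logic separate from the models, so the toy rules could be swapped for real model predictions without changing `two_stage_classify`.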