Lu Jiahui, Zhang Huibin, Xiao Yi, Wang Yingyu
State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing, China.
School of New Media and Communication, Tianjin University, Tianjin, China.
JMIR AI. 2024 Jan 29;3:e47240. doi: 10.2196/47240.
Amidst the COVID-19 pandemic, misinformation on social media has posed significant threats to public health. Detecting misinformation and predicting its spread are crucial for mitigating its adverse effects. However, prevailing frameworks for these tasks have focused predominantly on post-level signals of misinformation, neglecting features of the broader information environment in which misinformation originates and proliferates.
This study aims to create a novel framework that integrates the uncertainty of the information environment into misinformation features, in order to improve model accuracy in misinformation detection and in predicting the scale of dissemination. The objective is to provide better support for online governance efforts during health crises.
In this study, we incorporated uncertainty features of the information environment and introduced a novel Environmental Uncertainty Perception (EUP) framework for detecting misinformation and predicting its spread on social media. The framework encompasses uncertainty at 4 scales of the information environment: physical environment, macro-media environment, micro-communicative environment, and message framing. We assessed the effectiveness of the EUP using real-world COVID-19 misinformation data sets.
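As a rough illustration of how uncertainty at the 4 scales might sit alongside post-level features, the following minimal sketch represents each scale as one score and appends the scores to a post-level feature vector. The abstract does not specify how the uncertainty features are computed or how they are combined with a baseline model, so the class name, the one-score-per-scale representation, and the use of simple concatenation are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnvironmentalUncertainty:
    """Hypothetical container: one uncertainty score per scale named in the abstract."""
    physical: float             # physical environment
    macro_media: float          # macro-media environment
    micro_communicative: float  # micro-communicative environment
    message_framing: float      # message framing

    def as_vector(self) -> List[float]:
        # Fixed ordering so downstream models see a consistent layout.
        return [
            self.physical,
            self.macro_media,
            self.micro_communicative,
            self.message_framing,
        ]


def fuse(post_features: List[float], env: EnvironmentalUncertainty) -> List[float]:
    """Append the environment scores to post-level features.

    Concatenation is an assumed fusion strategy, not the authors' stated method.
    """
    return list(post_features) + env.as_vector()


# Toy values only; they are not taken from the study.
fused = fuse([0.1, 0.2], EnvironmentalUncertainty(0.3, 0.4, 0.5, 0.6))
```

In such a setup, the fused vector would be passed to whatever classifier handles detection or spread prediction, which is one way a baseline model could be combined with environment-level features.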
In the experiments, the EUP alone achieved a detection accuracy of 0.753 and a prediction accuracy of 0.71. These results were comparable to those of state-of-the-art baseline models such as bidirectional long short-term memory (BiLSTM; detection accuracy 0.733 and prediction accuracy 0.707) and bidirectional encoder representations from transformers (BERT; detection accuracy 0.755 and prediction accuracy 0.728). When the baseline models were combined with the EUP, their accuracy improved by an average of 1.98% for misinformation detection and 2.4% for spread prediction. On unbalanced data sets, the EUP yielded relative improvements of 21.5% in macro-F1-score and 5.7% in area under the curve.
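For readers less familiar with the metrics reported on unbalanced data, the sketch below shows the standard definition of macro-F1 (the unweighted mean of per-class F1 scores) and of a relative improvement, (new - old) / old. The function names and the toy labels are illustrative assumptions; the abstract does not report the underlying baseline scores, so no figure from the study is reproduced here.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1, so minority classes count equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative change of a metric with respect to its baseline value."""
    return (improved - baseline) / baseline


# Toy, deliberately unbalanced labels (not data from the study).
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = macro_f1(y_true, y_pred)
gain = relative_improvement(0.5, 0.6)
```

Because macro-F1 averages over classes rather than over samples, a model that ignores the minority class is penalized even when its overall accuracy looks high, which is why it is a common choice for unbalanced data sets.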
This study contributes to the literature by identifying uncertainty features of the information environment as a crucial factor for improving misinformation detection and spread-prediction algorithms during the pandemic. It elaborates on the complexity of uncertain information environments for misinformation across 4 distinct scales: the physical environment, macro-media environment, micro-communicative environment, and message framing. The findings underscore the effectiveness of incorporating uncertainty into misinformation detection and spread prediction, providing an interdisciplinary and easily implementable framework for the field.