Shah Zubair, Surian Didi, Dyda Amalie, Coiera Enrico, Mandl Kenneth D, Dunn Adam G
Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
Division of Information and Communication Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
J Med Internet Res. 2019 Nov 4;21(11):e14007. doi: 10.2196/14007.
Tools used to appraise the credibility of health information are time-consuming to apply and require context-specific expertise, limiting their use for quickly identifying and mitigating the spread of misinformation as it emerges.
The aim of this study was to estimate the proportion of vaccine-related Twitter posts linked to Web pages of low credibility and measure the potential reach of those posts.
We sampled 474 Web pages from the 143,003 unique vaccine-related Web pages shared on Twitter between January 2017 and March 2018 and manually appraised their credibility using a 7-point checklist adapted from validated tools and guidelines. The appraised pages were used to train several classifiers (random forests, support vector machines, and recurrent neural networks) that use the text of a Web page to predict whether the information satisfies each of the 7 criteria. After using the classifiers to estimate the credibility of all other Web pages, we used the follower network to estimate potential exposures relative to a credibility score defined by the 7-point checklist.
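The scoring and exposure steps described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the cut-offs separating low, medium, and high credibility, and the definition of potential exposure as a sum of follower counts are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

N_CRITERIA = 7  # the checklist described in the text has 7 criteria


def credibility_score(criteria_met: Sequence[bool]) -> int:
    """Count how many of the 7 checklist criteria a page is predicted to satisfy."""
    if len(criteria_met) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} criteria, got {len(criteria_met)}")
    return sum(1 for met in criteria_met if met)


def credibility_band(score: int, low_max: int = 2, medium_max: int = 4) -> str:
    """Map a 0-7 score to a band.

    ASSUMPTION: the default cut-offs (low_max, medium_max) are illustrative;
    the text above does not state where the boundaries lie.
    """
    if not 0 <= score <= N_CRITERIA:
        raise ValueError("score must be between 0 and 7")
    if score <= low_max:
        return "low"
    if score <= medium_max:
        return "medium"
    return "high"


def potential_exposures(posts: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum the follower counts of the accounts that posted each URL.

    ASSUMPTION: each post is a (url, follower_count) pair, and potential
    exposure is approximated by adding follower counts per URL.
    """
    totals: Dict[str, int] = defaultdict(int)
    for url, followers in posts:
        totals[url] += followers
    return dict(totals)
```

In such a sketch, the 7 booleans passed to `credibility_score` would come from the per-criterion classifiers, one prediction per criterion.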
The best-performing classifiers distinguished between low, medium, and high credibility with an accuracy of 78% and labeled low-credibility Web pages with a precision of over 96%. Across the set of unique Web pages, 11.86% (16,961 of 143,003) were estimated to be of low credibility, and they generated 9.34% (1.64 billion of 17.6 billion) of potential exposures. The 100 most popular links to low-credibility Web pages were each potentially seen by an estimated 2 million to 80 million Twitter users globally.
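The reported proportions can be checked with simple arithmetic. The snippet below is only a sanity check on the figures quoted above; the helper name `share` is hypothetical. Because the exposure counts are given in rounded billions, the recomputed exposure share (about 9.3%) matches the reported 9.34% only approximately.

```python
def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Share of unique Web pages estimated to be of low credibility.
low_page_share = share(16_961, 143_003)

# Share of potential exposures from low-credibility pages, using the
# rounded counts quoted in the text (1.64 billion of 17.6 billion).
low_exposure_share = share(1.64e9, 17.6e9)

print(f"low-credibility pages: {low_page_share:.2f}%")
print(f"low-credibility exposures (rounded inputs): {low_exposure_share:.1f}%")
```

The exposure share being smaller than the page share is what underlies the statement that low-credibility pages tend to reach fewer users than other pages overall.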
The results indicate that although a small minority of low-credibility Web pages reach a large audience, low-credibility Web pages tend to reach fewer users than other Web pages overall and are more commonly shared within certain subpopulations. An automatic credibility appraisal tool may be useful for finding communities of users at higher risk of exposure to low-credibility vaccine communications.