How Twitter data sampling biases U.S. voter behavior characterizations.

Authors

Yang Kai-Cheng, Hui Pik-Mai, Menczer Filippo

Affiliation

Observatory on Social Media, Indiana University, Bloomington, Indiana, United States.

Publication information

PeerJ Comput Sci. 2022 Jul 1;8:e1025. doi: 10.7717/peerj-cs.1025. eCollection 2022.

Abstract

Online social media are key platforms for the public to discuss political issues. As a result, researchers have used data from these platforms to analyze public opinions and forecast election results. The literature has shown that due to inauthentic actors such as malicious social bots and trolls, not every message is a genuine expression from a legitimate user. However, the prevalence of inauthentic activities in social data streams is still unclear, making it difficult to gauge biases of analyses based on such data. In this article, we aim to close this gap using Twitter data from the 2018 U.S. midterm elections. We propose an efficient and low-cost method to identify voters on Twitter and systematically compare their behaviors with different random samples of accounts. We find that some accounts flood the public data stream with political content, drowning the voice of the majority of voters. As a result, these hyperactive accounts are over-represented in volume samples. Hyperactive accounts are more likely to exhibit various suspicious behaviors and to share low-credibility information compared to likely voters. Our work provides insights into biased voter characterizations when using social media data to analyze political issues.
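The over-representation effect described in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under invented assumptions, not the authors' method or data: the account counts, the posting rates, and all function names (`build_stream`, `sample_accounts`, `sample_volume`, `hyperactive_share`) are hypothetical. It contrasts drawing accounts uniformly with drawing tweets uniformly from the stream (a "volume sample").

```python
import random

def build_stream(n_ordinary=95, n_hyperactive=5, ordinary_rate=2, hyperactive_rate=200):
    """Return (stream, hyperactive_ids); stream holds one author id per tweet.

    All counts and rates are made-up illustration values, not figures from the paper.
    """
    ordinary = [f"user{i}" for i in range(n_ordinary)]
    hyperactive = [f"hyper{i}" for i in range(n_hyperactive)]
    stream = [a for a in ordinary for _ in range(ordinary_rate)]
    stream += [a for a in hyperactive for _ in range(hyperactive_rate)]
    return stream, set(hyperactive)

def sample_accounts(stream, k, rng):
    """Account sample: every distinct account is equally likely to be drawn."""
    return rng.sample(sorted(set(stream)), k)

def sample_volume(stream, k, rng):
    """Volume sample: every tweet is equally likely, so heavy posters dominate."""
    return [rng.choice(stream) for _ in range(k)]

def hyperactive_share(authors, hyperactive):
    """Fraction of the given authors that belong to the hyperactive set."""
    return sum(a in hyperactive for a in authors) / len(authors)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    stream, hyper = build_stream()
    print("account sample:", hyperactive_share(sample_accounts(stream, 50, rng), hyper))
    print("volume sample: ", hyperactive_share(sample_volume(stream, 50, rng), hyper))
```

With these toy numbers, hyperactive accounts make up 5% of accounts but produce 1000 of the 1190 tweets, so a volume sample is expected to be dominated by them while an account sample is not.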

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6920/9299280/e09cb82af8d4/peerj-cs-08-1025-g001.jpg
