"I'm in the Bluesky Tonight": Insights from a year worth of social data.

Affiliations

Department of Computer Science, University of Pisa, Pisa, Italy.

National Research Council, Institute of Information Science and Technologies "A. Faedo" (ISTI), Pisa, Italy.

Publication information

PLoS One. 2024 Nov 5;19(11):e0310330. doi: 10.1371/journal.pone.0310330. eCollection 2024.

Abstract

Pollution of online social spaces caused by rampaging d/misinformation is a growing societal concern. However, recent decisions to reduce access to social media APIs are causing a shortage of publicly available, recent, social media data, thus hindering the advancement of computational social science as a whole. We present a large, high-coverage dataset of social interactions and user-generated content from Bluesky Social to address this pressing issue. The dataset contains the complete post history of over 4M users (81% of all registered accounts), totalling 235M posts. We also make available social data covering follow, comment, repost, and quote interactions. Since Bluesky allows users to create and like feed generators (i.e., content recommendation algorithms), we also release the full output of several popular algorithms available on the platform, along with their timestamped "like" interactions. This dataset allows novel analysis of online behavior and human-machine engagement patterns. Notably, it provides ground-truth data for studying the effects of content exposure and self-selection and performing content virality and diffusion analysis.
