Comparable 2022 General Election Advertising Datasets from Meta and Google.

Authors

Zhang Meiqing, Cakmak Furkan, Neumann Markus, Zimmeck Sebastian, Oleinikov Pavel, Yao Jielu, Yu Harry, Jacewicz Aleks, Tassone Isabella, Floyd Breeze, Baum Laura, Franz Michael M, Ridout Travis N, Fowler Erika Franklin

Affiliations

Wesleyan University, Wesleyan Media Project, Middletown, 06459, USA.

Duke Kunshan University, Division of Social Sciences, Kunshan, 215316, China.

Publication information

Sci Data. 2025 Jun 9;12(1):968. doi: 10.1038/s41597-025-05228-w.

Abstract

This paper introduces two comprehensive datasets containing information on digital ads in U.S. federal elections from Meta (including Facebook and Instagram) and Google (including YouTube) for the 2022 midterm general election period. We collected ads published on these platforms utilizing their ad transparency libraries and web scraping techniques and added labels to make them more comparable. The collected data underwent processing to extract audiovisual and textual information through automatic speech recognition (ASR), face recognition, and optical character recognition (OCR). Additionally, we performed several classification tasks to enhance the utility of the dataset. The resulting datasets encompass a rich array of features, including metadata, transcripts, and classifications. These datasets provide valuable resources for researchers, policymakers, and journalists to analyze the digital election advertising landscape, campaign strategies, and public engagement. By offering detailed and structured data, our work facilitates diverse reuse possibilities in fields such as political science, communication studies, and data science, enabling comprehensive analysis and insights into the dynamics of digital political campaigns.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4ad/12149306/9f751849445b/41597_2025_5228_Fig1_HTML.jpg
