
LOCO: The 88-million-word language of conspiracy corpus.

Author information

Miani Alessandro, Hills Thomas, Bangerter Adrian

Affiliations

Institute of Work and Organizational Psychology, University of Neuchâtel, Rue Emile-Argand 11, 2000, Neuchâtel, Switzerland.

Department of Psychology, University of Warwick, University Road, Coventry, CV4 7AL, UK.

Publication information

Behav Res Methods. 2022 Aug;54(4):1794-1817. doi: 10.3758/s13428-021-01698-z. Epub 2021 Oct 25.

DOI:10.3758/s13428-021-01698-z
PMID:34697754
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8545361/
Abstract

The spread of online conspiracy theories represents a serious threat to society. To understand the content of conspiracies, here we present the language of conspiracy (LOCO) corpus. LOCO is an 88-million-token corpus composed of topic-matched conspiracy (N = 23,937) and mainstream (N = 72,806) documents harvested from 150 websites. Mimicking internet user behavior, documents were identified using Google by crossing a set of seed phrases with a set of websites. LOCO is hierarchically structured, meaning that each document is cross-nested within websites (N = 150) and topics (N = 600, on three different resolutions). A rich set of linguistic features (N = 287) and metadata includes upload date, measures of social media engagement, measures of website popularity, size, and traffic, as well as political bias and factual reporting annotations. We explored LOCO's features from different perspectives showing that documents track important societal events through time (e.g., Princess Diana's death, Sandy Hook school shooting, coronavirus outbreaks), while patterns of lexical features (e.g., deception, power, dominance) overlap with those extracted from online social media communities dedicated to conspiracy theories. By computing within-subcorpus cosine similarity, we derived a subset of the most representative conspiracy documents (N = 4,227), which, compared to other conspiracy documents, display prototypical and exaggerated conspiratorial language and are more frequently shared on Facebook. We also show that conspiracy website users navigate to websites via more direct means than mainstream users, suggesting confirmation bias. LOCO and related datasets are freely available at https://osf.io/snpcg/ .
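The abstract says the most representative conspiracy documents were derived by computing within-subcorpus cosine similarity. The following is a minimal sketch of that idea, not the authors' pipeline. It assumes whitespace-tokenized term counts as the document representation and the mean similarity to all other documents in the same subcorpus as the representativeness score; the function names and the toy texts are hypothetical, and the abstract does not state the actual feature set or selection threshold.

```python
import math
from collections import Counter
from typing import List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def representativeness(docs: List[str]) -> List[float]:
    """Mean cosine similarity of each document to every other document.

    Assumption: plain term counts stand in for whatever document
    representation the original study used.
    """
    vectors = [Counter(doc.lower().split()) for doc in docs]
    scores = []
    for i, v in enumerate(vectors):
        others = [cosine(v, w) for j, w in enumerate(vectors) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores


def most_representative(docs: List[str], top_k: int) -> List[int]:
    """Indices of the top_k documents, ranked by representativeness."""
    scores = representativeness(docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


# Hypothetical toy subcorpus; real LOCO documents are far longer.
toy_subcorpus = [
    "secret plan hidden by the government",
    "the government hides a secret plan",
    "recipe for apple pie with cinnamon",
]
top_two = most_representative(toy_subcorpus, top_k=2)
```

On the toy subcorpus, the two documents that share vocabulary rank above the unrelated one. At the scale of LOCO (tens of thousands of documents), the pairwise loop would normally be replaced by a sparse matrix product.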


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef90/9374614/09482cdc1af0/13428_2021_1698_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef90/9374614/591d823bb610/13428_2021_1698_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef90/9374614/755ab2eac70a/13428_2021_1698_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef90/9374614/83d34d886089/13428_2021_1698_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef90/9374614/d845ff3bb8f1/13428_2021_1698_Fig5_HTML.jpg

