
Guideline for improving the reliability of Google Ngram studies: Evidence from religious terms.

Affiliation

Department of Psychology, University of Konstanz, Konstanz, Germany.

Publication information

PLoS One. 2019 Mar 22;14(3):e0213554. doi: 10.1371/journal.pone.0213554. eCollection 2019.

DOI:10.1371/journal.pone.0213554
PMID:30901329
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6430395/
Abstract

The Google Books Ngram Viewer (Google Ngram) is a search engine that charts word frequencies from a large corpus of books and thereby allows for the examination of cultural change as it is reflected in books. While the tool's massive corpus of data (about 8 million books or 6% of all books ever published) has been used in various scientific studies, concerns about the accuracy of results have simultaneously emerged. This paper reviews the literature and serves as a guideline for improving Google Ngram studies by suggesting five methodological procedures suited to increase the reliability of results. In particular, we recommend the use of (I) different language corpora, (II) cross-checks on different corpora from the same language, (III) word inflections, (IV) synonyms, and (V) a standardization procedure that accounts for both the influx of data and unequal weights of word frequencies. Further, we outline how to combine these procedures and address the risk of potential biases arising from censorship and propaganda. As an example of the proposed procedures, we examine the cross-cultural expression of religion via religious terms for the years 1900 to 2000. Special emphasis is placed on the situation during World War II. In line with the strand of literature that emphasizes the decline of collectivistic values, our results suggest an overall decrease of religion's importance. However, religion regains importance during times of crisis such as World War II. By comparing the results obtained through the different methods, we illustrate that applying and particularly combining our suggested procedures increases the reliability of results and prevents authors from deriving wrong assumptions.
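The abstract does not spell out how procedures (III) to (V) are computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each term (a synonym or inflection) already comes as a series of yearly relative frequencies, z-scores each series so that a very common word cannot dominate a rare one, and then averages the z-scores year by year. The function names `z_scores` and `composite_index` and all numbers are hypothetical and purely illustrative.

```python
from statistics import mean, pstdev


def z_scores(series):
    """Standardize one term's yearly frequencies to mean 0 and SD 1."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A flat series carries no trend information.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def composite_index(freqs_by_term):
    """Average the z-scored series of several terms, year by year."""
    standardized = [z_scores(s) for s in freqs_by_term.values()]
    return [mean(year_values) for year_values in zip(*standardized)]


# Toy relative frequencies for three years (made-up, NOT real Ngram data).
# "church" is ten times more frequent than "prayer" in absolute terms.
toy = {
    "church": [0.0100, 0.0080, 0.0060],
    "prayer": [0.0010, 0.0008, 0.0006],
}
index = composite_index(toy)
```

Because both toy series fall by the same proportion, they contribute equally to the composite after standardization, whereas a raw sum would be driven almost entirely by the more frequent word.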


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b8c/6430395/0270d607e136/pone.0213554.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b8c/6430395/60e084430748/pone.0213554.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b8c/6430395/83b0585a603b/pone.0213554.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b8c/6430395/7a017016a160/pone.0213554.g004.jpg

Similar articles

1
Guideline for improving the reliability of Google Ngram studies: Evidence from religious terms.
PLoS One. 2019 Mar 22;14(3):e0213554. doi: 10.1371/journal.pone.0213554. eCollection 2019.
2
The changing psychology of culture in German-speaking countries: A Google Ngram study.
Int J Psychol. 2018 Oct;53 Suppl 1:53-62. doi: 10.1002/ijop.12428. Epub 2017 May 5.
3
Increasing digitalization is associated with anxiety and depression: A Google Ngram analysis.
PLoS One. 2023 Apr 7;18(4):e0284091. doi: 10.1371/journal.pone.0284091. eCollection 2023.
4
Culturomics and the history of psychiatry: testing the Google Ngram method.
Ir J Psychol Med. 2019 Mar;36(1):23-27. doi: 10.1017/ipm.2017.37.
5
Historical time in the age of big data: Cultural psychology, historical change, and the Google Books Ngram Viewer.
Hist Psychol. 2016 May;19(2):141-153. doi: 10.1037/hop0000023.
6
Oldsters and Ngrams: age stereotypes across time.
Psychol Rep. 2015 Feb;116(1):324-9. doi: 10.2466/17.10.PR0.116k17w6. Epub 2015 Feb 4.
7
Analysis of Historical Medical Phenomena Using Large N-Gram Corpora.
Stud Health Technol Inform. 2017;245:437-441.
8
Transition to market economy promotes individualistic values: Analysing changes in frequencies of Russian words from 1980 to 2008.
Int J Psychol. 2019 Feb;54(1):23-32. doi: 10.1002/ijop.12411. Epub 2017 Jan 11.
9
Not all cultural values are created equal: Cultural change in China reexamined through Google books.
Int J Psychol. 2019 Feb;54(1):144-154. doi: 10.1002/ijop.12436. Epub 2017 Jun 20.
10
The rise and fall of rationality in language.
Proc Natl Acad Sci U S A. 2021 Dec 21;118(51). doi: 10.1073/pnas.2107848118.

Cited by

1
Pathogen stress heightens sensorimotor dimensions in the human collective semantic space.
Commun Psychol. 2025 Jan 5;3(1):2. doi: 10.1038/s44271-024-00183-5.
2
The impact of terrorist attacks on cultural values as expressed in books.
PLoS One. 2024 Nov 22;19(11):e0311095. doi: 10.1371/journal.pone.0311095. eCollection 2024.
3
Benford's Law applies to word frequency rank in English, German, French, Spanish, and Italian.
