Assessing the usefulness of Google Books' word frequencies for psycholinguistic research on word processing.

Authors

Brysbaert Marc, Keuleers Emmanuel, New Boris

Affiliation

Department of Experimental Psychology, Ghent University, Ghent, Belgium.

Publication information

Front Psychol. 2011 Mar 2;2:27. doi: 10.3389/fpsyg.2011.00027. eCollection 2011.

Abstract

In this Perspective Article we assess the usefulness of Google's new word frequencies for word recognition research (lexical decision and word naming). We find that, despite the massive corpus on which the Google estimates are based (131 billion words from books published in the United States alone), the Google American English frequencies explain 11% less of the variance in the lexical decision times from the English Lexicon Project (Balota et al., 2007) than the SUBTLEX-US word frequencies, based on a corpus of 51 million words from film and television subtitles. Further analyses indicate that word frequencies derived from recent books (published after 2000) are better predictors of word processing times than frequencies based on the full corpus, and that word frequencies based on fiction books predict word processing times better than word frequencies based on the full corpus. The most predictive word frequencies from Google still do not explain more of the variance in word recognition times of undergraduate students and old adults than the subtitle-based word frequencies.
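The comparison in the abstract rests on how much variance in response times each frequency measure explains. The sketch below shows one simple way to compute that share for two competing frequency lists. It is not the authors' analysis: the single-predictor regression, the log10(count + 1) transform, the helper names `r_squared` and `log_frequency`, and all numbers are illustrative assumptions, and reading the 11% as a difference in R² is also an assumption.

```python
import math


def r_squared(predictor, outcome):
    """Share of variance in `outcome` explained by a simple linear
    regression on `predictor` (equal to the squared Pearson correlation)."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in outcome)
    return (sxy * sxy) / (sxx * syy)


def log_frequency(counts):
    """Log-transform raw counts; adding 1 (an assumed convention here)
    keeps zero counts defined."""
    return [math.log10(c + 1) for c in counts]


# Made-up toy values for five hypothetical words; not data from the article.
rt_ms = [520, 560, 610, 650, 700]        # mean lexical decision times
freq_a = [90000, 9000, 900, 90, 9]       # counts from hypothetical corpus A
freq_b = [50000, 200, 8000, 30, 700]     # counts from hypothetical corpus B

r2_a = r_squared(log_frequency(freq_a), rt_ms)
r2_b = r_squared(log_frequency(freq_b), rt_ms)

# Difference in explained variance, expressed in percentage points.
difference_in_points = 100 * (r2_a - r2_b)
print(f"R2 (A) = {r2_a:.3f}, R2 (B) = {r2_b:.3f}, "
      f"difference = {difference_in_points:.1f} points")
```

With real data, `rt_ms` would hold per-word response times such as those from the English Lexicon Project, and the two count lists would come from the frequency norms being compared, restricted to the words both lists share.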

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5e2a/3111095/90db65c08126/fpsyg-02-00027-g001.jpg
