Spoken word frequency counts based on 1.6 million words in American English.

Author information

Pastizzo Matthew J, Carbone Robert F

Affiliation

Psychology Department, State University of New York, Geneseo 14454, USA.

Publication information

Behav Res Methods. 2007 Nov;39(4):1025-8. doi: 10.3758/bf03193000.

Abstract

Written word frequency (e.g., Francis & Kučera, 1982; Kučera & Francis, 1967) constitutes a popular measure of word familiarity, which is highly predictive of word recognition. Far less often, researchers employ spoken frequency counts in their studies. This discrepancy can be attributed most readily to the conspicuous absence of a sizeable spoken frequency count for American English. The present article reports the construction of a 1.6-million-word spoken frequency database derived from the Michigan Corpus of Academic Spoken English (Simpson, Swales, & Briggs, 2002). We generated spoken frequency counts for 34,922 words and extracted speaker attributes from the source material to generate relative frequencies of words spoken by each speaker category. We assess the predictive validity of these counts, and discuss some possible applications outside of word recognition studies.
