SUBTLEX-CY: A new word frequency database for Welsh.

Affiliations

School of Psychology, University of Nottingham, Nottingham, UK.

School of Psychology, Wrexham Glyndŵr University, Wrexham, UK.

Publication information

Q J Exp Psychol (Hove). 2024 May;77(5):1052-1067. doi: 10.1177/17470218231190315. Epub 2023 Aug 30.

Abstract

We present SUBTLEX-CY, a new word frequency database created from a 32-million-word corpus of Welsh television subtitles. An experiment comprising a lexical decision task examined SUBTLEX-CY frequency estimates against words with inconsistent frequencies in a much smaller Welsh corpus that is often used by researchers, the Cronfa Electroneg o Gymraeg (CEG), and three other Welsh word frequency databases. Words were selected that were classified as low frequency (LF) in SUBTLEX-CY and high frequency (HF) in CEG and compared with words that were classified as medium frequency (MF) in both SUBTLEX-CY and CEG. Reaction time analyses showed that HF words in CEG were responded to more slowly than MF words, suggesting that the SUBTLEX-CY corpus provides a more reliable estimate of Welsh word frequencies. The new Welsh word frequency database, which also includes part-of-speech, contextual diversity, and other lexical information, is freely available for research purposes on the Open Science Framework repository at https://osf.io/9gkqm/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d005/11032624/53890ab0bb4b/10.1177_17470218231190315-fig1.jpg
