CCLOWW: A grade-level Chinese children's lexicon of written words.

Author information

Key Laboratory of Brain Functional Genomics (MOE & STCSM), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China.

School of Psychology, Nanjing Normal University, Nanjing, China.

Publication information

Behav Res Methods. 2023 Jun;55(4):1874-1889. doi: 10.3758/s13428-022-01890-9. Epub 2022 Jul 1.

Abstract

In this article, we present the Chinese Children's Lexicon of Written Words (CCLOWW), the first grade-level database that provides frequency statistics of simplified Chinese characters and words for children. The database is computed from a corpus of 34,671,424 character tokens and 22,427,010 word tokens (including single- and multicharacter words), extracted from 2131 books. It contains 6746 different character types and 153,079 different word types. CCLOWW provides several frequency indices of simplified Chinese for three grade levels (grade 2 and below, grades 3-4, grades 5-6) to profile children's experience with written Chinese in and outside of school. We describe in this article the distributions of frequency and contextual diversity of the characters and words, as well as word length and syntactic categories of the words in the corpus and the subcorpora. We also report results of correlation analyses with other written corpora and of several naming and lexical decision experiments. The findings suggest that CCLOWW frequency measures correlate well with those of other corpora. Importantly, they reliably predicted children's and adults' naming and lexical decision performance. They also explained variance in adults' visual word recognition beyond that accounted for by frequency measures computed from an adult corpus, indicating that early print exposure might influence readers' later lexical processing beyond an age-of-acquisition effect. CCLOWW will help researchers in language processing and development, as well as educators, select language materials appropriate for children's developmental stages. The database is freely available online at https://www.learn2read.cn/database/ .
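As a rough illustration of the kinds of indices the abstract mentions, the sketch below computes a raw count, a per-million frequency, and a contextual diversity score from a tiny tokenized corpus. It is not the authors' pipeline: the function name `frequency_stats`, the toy books, and the definition of contextual diversity as the proportion of books in which an item appears are all assumptions made for illustration, and the texts are assumed to be already segmented into words.

```python
from collections import Counter


def frequency_stats(books):
    """Compute simple frequency indices for a tokenized corpus.

    books: a list of token lists, one list per book (hypothetical input
    format; word segmentation is assumed to have been done already).
    """
    token_counts = Counter()  # total occurrences of each word
    book_counts = Counter()   # number of books containing each word

    for tokens in books:
        token_counts.update(tokens)
        book_counts.update(set(tokens))  # count each word once per book

    total_tokens = sum(token_counts.values())
    n_books = len(books)

    stats = {}
    for word, count in token_counts.items():
        stats[word] = {
            "count": count,
            # Frequency normalized to occurrences per million tokens.
            "per_million": count / total_tokens * 1_000_000,
            # Assumed definition: share of books in which the word occurs.
            "contextual_diversity": book_counts[word] / n_books,
        }
    return stats


# Toy corpus of three "books" (invented data, for illustration only).
toy_books = [
    ["我", "喜欢", "读书"],
    ["我", "喜欢", "画画"],
    ["小猫", "喜欢", "鱼"],
]
stats = frequency_stats(toy_books)
```

Running the same function separately on the books assigned to each grade band would give grade-level indices of the kind the abstract describes, under the same assumptions.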
