Measurement of Lexical Diversity in Children's Spoken Language: Computational and Conceptual Considerations.

Authors

Yang Ji Seung, Rosvold Carly, Bernstein Ratner Nan

Affiliations

Department of Human Development and Quantitative Methodology, University of Maryland, College Park, College Park, MD, United States.

Department of Hearing and Speech Sciences, Program in Neuroscience and Cognitive Science, College Park, MD, United States.

Publication information

Front Psychol. 2022 Jun 22;13:905789. doi: 10.3389/fpsyg.2022.905789. eCollection 2022.

DOI:10.3389/fpsyg.2022.905789

PMID:35814069

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9257278/

Abstract

BACKGROUND

Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few language sample analysis (LSA) measures calculated by clinicians in everyday practice. However, it has significant, well-documented shortcomings; these include instability as a function of sample size and the absence of clear developmental profiles over early childhood. A variety of alternative measures of lexical diversity have been proposed; some, such as Number of Different Words/100 (NDW), can also be computed by hand. However, others, such as Vocabulary Diversity (VocD) and the Moving Average Type Token Ratio (MATTR), rely on complex resampling algorithms that cannot be conducted by hand. To date, no large-scale study of all four measures has evaluated how well any of them captures typical developmental trends over early childhood, or whether any reliably distinguishes typical from atypical profiles of expressive child language ability.
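As a rough illustration of how three of these measures differ, the following minimal Python sketch computes TTR, NDW, and MATTR on a toy token list. It is not the implementation used in the study: the whitespace tokenizer, the function names, the choice to take NDW over the first 100 tokens, the MATTR window length, and the fallback for samples shorter than the window are all assumptions made for illustration. VocD is omitted because it additionally requires repeated random sampling and curve fitting.

```python
from typing import List


def tokenize(utterance: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberate simplification;
    real transcripts need morphological and transcription-code handling)."""
    return utterance.lower().split()


def ttr(tokens: List[str]) -> float:
    """Type-Token Ratio: number of distinct word forms / total words."""
    if not tokens:
        raise ValueError("empty sample")
    return len(set(tokens)) / len(tokens)


def ndw(tokens: List[str], sample_size: int = 100) -> int:
    """Number of different words in the first `sample_size` tokens
    (assumption: NDW/100 is taken over the first 100 words)."""
    return len(set(tokens[:sample_size]))


def mattr(tokens: List[str], window: int = 10) -> float:
    """Moving-Average TTR: the mean TTR over every run of `window`
    consecutive tokens. The window length here is a hypothetical default."""
    if len(tokens) < window:
        # Assumption: fall back to plain TTR for very short samples.
        return ttr(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Toy sample, invented for illustration only.
sample = tokenize("the dog sees the cat and the cat sees the dog")

print("TTR:", ttr(sample))
print("NDW:", ndw(sample))
print("MATTR (window=4):", mattr(sample, window=4))

# Repeating the same words doubles the token count without adding types,
# so TTR drops even though the vocabulary is unchanged.
print("TTR of doubled sample:", ttr(sample * 2))
```

The last line hints at the sample-size instability noted above: TTR falls as a sample grows longer, whereas MATTR averages over fixed-length windows and is therefore less sensitive to overall sample length.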

MATERIALS AND METHODS

We conducted linear and non-linear regression analyses for TTR, NDW, VocD, and MATTR scores for samples taken from 946 corpora from typically developing preschool children (ages 2-6 years), engaged in adult-child toy play, from the Child Language Data Exchange System (CHILDES). These were contrasted with 504 samples from children known to have delayed expressive language skills (total = 1,454 samples). We also conducted a separate sub-analysis which examined possible contextual effects of sampling environment on lexical diversity.

RESULTS

Only VocD showed significantly different mean scores between the typically developing children and the children with delayed development. Using TTR would actually misdiagnose typical children and miss children with known language impairment. However, VocD also varied significantly as a function of toy interactions, which emerges as a further caution against using lexical diversity as a valid proxy index of children's expressive vocabulary skill.

DISCUSSION

This large-scale statistical comparison of computer-implemented algorithms for expressive lexical profiles in young children with traditional, hand-calculated measures showed that only VocD met criteria for evidence-based use in LSA. However, VocD was affected by sample elicitation context, suggesting that non-linguistic factors, such as engagement with elicitation props, contaminate estimates of spoken lexical skill in young children. Implications and suggested directions are discussed.
