The secret is in the sound: from unsegmented speech to lexical categories.

Authors

Christiansen Morten H, Onnis Luca, Hockema Stephen A

Affiliation

Department of Psychology, Cornell University, Ithaca, NY 14853, USA.

Publication information

Dev Sci. 2009 Apr;12(3):388-95. doi: 10.1111/j.1467-7687.2009.00824.x.

DOI:10.1111/j.1467-7687.2009.00824.x

PMID:19371361

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2743257/

Abstract

When learning language, young children are faced with many seemingly formidable challenges, including discovering words embedded in a continuous stream of sounds and determining what role these words play in syntactic constructions. We suggest that knowledge of phoneme distributions may play a crucial part in helping children segment words and determine their lexical category, and we propose an integrated model of how children might go from unsegmented speech to lexical categories. We corroborated this theoretical model using a two-stage computational analysis of a large corpus of English child-directed speech. First, we used transition probabilities between phonemes to find words in unsegmented speech. Second, we used distributional information about word edges--the beginning and ending phonemes of words--to predict whether the segmented words from the first stage were nouns, verbs, or something else. The results indicate that discovering lexical units and their associated syntactic category in child-directed speech is possible by attending to the statistics of single phoneme transitions and word-initial and final phonemes. Thus, we suggest that a core computational principle in language acquisition is that the same source of information is used to learn about different aspects of linguistic structure.
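The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how boundaries are derived from transition probabilities or which classifier is used on word edges, so this sketch assumes a local-minimum boundary rule for stage one and substitutes a simple majority-vote lookup on (first phoneme, last phoneme) pairs for stage two. Single characters stand in for phonemes, and the toy corpus, the category labels, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probs(utterances):
    """Forward transition probability P(next | current) over phoneme bigrams."""
    pair_counts = Counter()
    first_counts = Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(utt, tp):
    """Stage 1 (assumed rule): cut wherever the transition probability
    is strictly lower than both neighbouring transitions."""
    probs = [tp.get((a, b), 0.0) for a, b in zip(utt, utt[1:])]
    if len(probs) < 2:
        return [utt]  # too short to contain a dip
    words, start = [], 0
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else float("inf")
        right = probs[i + 1] if i + 1 < len(probs) else float("inf")
        if p < left and p < right:
            words.append(utt[start:i + 1])
            start = i + 1
    words.append(utt[start:])
    return words


def edges(word):
    """Word-edge cue: the first and last phoneme of a word."""
    return word[0], word[-1]


def train_edge_lookup(labelled_words):
    """Stage 2 (stand-in classifier): majority category per edge pair."""
    votes = defaultdict(Counter)
    for word, category in labelled_words:
        votes[edges(word)][category] += 1
    return {edge: c.most_common(1)[0][0] for edge, c in votes.items()}


def predict(word, lookup, default="other"):
    """Predict a category from word edges; unseen edge pairs get the default."""
    return lookup.get(edges(word), default)


# Toy corpus: three made-up "words" (ba, di, ku) in every order, unsegmented.
corpus = ["badiku", "dikuba", "kubadi", "bakudi", "dibaku", "kudiba"]
tp = transition_probs(corpus)
lookup = train_edge_lookup([("ba", "noun"), ("di", "verb")])  # hypothetical labels

for word in segment("badiku", tp):
    print(word, predict(word, lookup))
```

In this toy corpus the within-word transitions have probability 1.0 and the between-word transitions 0.5, so the dips fall exactly at the word boundaries. Real child-directed speech is far noisier, and the same phoneme statistics would be estimated from a phonemically transcribed corpus rather than from character strings.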
