Distributional regularity and phonotactic constraints are useful for segmentation.

Author information

Brent M R, Cartwright T A

Affiliation

Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21218, USA.

Publication information

Cognition. 1996 Oct-Nov;61(1-2):93-125. doi: 10.1016/s0010-0277(96)00719-6.

Abstract

In order to acquire a lexicon, young children must segment speech into words, even though most words are unfamiliar to them. This is a non-trivial task because speech lacks any acoustic analog of the blank spaces between printed words. Two sources of information that might be useful for this task are distributional regularity and phonotactic constraints. Informally, distributional regularity refers to the intuition that sound sequences that occur frequently and in a variety of contexts are better candidates for the lexicon than those that occur rarely or in few contexts. We express that intuition formally by a class of functions called DR functions. We then put forth three hypotheses: First, that children segment using DR functions. Second, that they exploit phonotactic constraints on the possible pronunciations of words in their language. Specifically, they exploit both the requirement that every word must have a vowel and the constraints that languages impose on word-initial and word-final consonant clusters. Third, that children learn which word-boundary clusters are permitted in their language by assuming that all permissible word-boundary clusters will eventually occur at utterance boundaries. Using computational simulation, we investigate the effectiveness of these strategies for segmenting broad phonetic transcripts of child-directed English. The results show that DR functions and phonotactic constraints can be used to significantly improve segmentation. Further, the contributions of DR functions and phonotactic constraints are largely independent, so using both yields better segmentation than using either one alone. Finally, learning the permissible word-boundary clusters from utterance boundaries does not degrade segmentation performance.
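The second and third hypotheses can be illustrated with a small sketch. It is not the authors' simulation: it uses ordinary letters as a rough stand-in for broad phonetic symbols, the function names are hypothetical, and it leaves out the DR functions entirely. It only shows how clusters observed at utterance edges could be collected and then used, together with the every-word-needs-a-vowel requirement, to reject candidate segmentations.

```python
# Toy illustration only: letters stand in for phonemes (an assumption),
# and all function names here are hypothetical.
VOWELS = set("aeiou")


def initial_cluster(s):
    """Return the consonants that precede the first vowel of s."""
    i = 0
    while i < len(s) and s[i] not in VOWELS:
        i += 1
    return s[:i]


def final_cluster(s):
    """Return the consonants that follow the last vowel of s."""
    i = len(s)
    while i > 0 and s[i - 1] not in VOWELS:
        i -= 1
    return s[i:]


def learn_boundary_clusters(utterances):
    """Collect the clusters seen at utterance starts and utterance ends."""
    onsets = {initial_cluster(u) for u in utterances}
    codas = {final_cluster(u) for u in utterances}
    return onsets, codas


def is_permissible(word, onsets, codas):
    """A word needs a vowel and edge clusters already seen at utterance edges."""
    has_vowel = any(ch in VOWELS for ch in word)
    return (
        has_vowel
        and initial_cluster(word) in onsets
        and final_cluster(word) in codas
    )


def segmentation_ok(words, onsets, codas):
    """Accept a candidate segmentation only if every word is permissible."""
    return all(is_permissible(w, onsets, codas) for w in words)


# Unsegmented toy utterances; only their edges inform the learned clusters.
utterances = ["thedog", "dogsee", "stopit", "itstops"]
onsets, codas = learn_boundary_clusters(utterances)

print(segmentation_ok(["the", "dog"], onsets, codas))
print(segmentation_ok(["thed", "og"], onsets, codas))
print(segmentation_ok(["th", "edog"], onsets, codas))
```

In this toy setting, a filter of this kind would be applied to the candidates that a distributional scoring function proposes, which is one way to picture why the abstract reports the two sources of information as largely independent.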
