How much does prosody help word segmentation? A simulation study on infant-directed speech.

Affiliations

Laboratory for Language Development, RIKEN Center for Brain Science, Japan; Laboratoire de Sciences Cognitives et Psycholinguistique, ENS Paris Sciences Lettres, EHESS, CNRS, France.

Laboratoire de Sciences Cognitives et Psycholinguistique, ENS Paris Sciences Lettres, EHESS, CNRS, France.

Publication information

Cognition. 2022 Feb;219:104961. doi: 10.1016/j.cognition.2021.104961. Epub 2021 Nov 29.

Abstract

Infants come to learn several hundreds of word forms by two years of age, and it is possible this involves carving these forms out from continuous speech. It has been proposed that the task is facilitated by the presence of prosodic boundaries. We revisit this claim by running computational models of word segmentation, with and without prosodic information, on a corpus of infant-directed speech. We use five cognitively-based algorithms, which vary in whether they employ a sub-lexical or a lexical segmentation strategy and whether they are simple heuristics or embody an ideal learner. Results show that providing expert-annotated prosodic breaks does not uniformly help all segmentation models. The sub-lexical algorithms, which perform more poorly, benefit most, while the lexical ones show a very small gain. Moreover, when prosodic information is derived automatically from the acoustic cues infants are known to be sensitive to, errors in the detection of the boundaries lead to smaller positive effects, and even negative ones for some algorithms. This shows that even though infants could potentially use prosodic breaks, it does not necessarily follow that they should incorporate prosody into their segmentation strategies, when confronted with realistic signals.
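To make the "with and without prosodic information" comparison concrete, here is a minimal sketch of one possible sub-lexical heuristic: forward transitional probabilities over syllables, with a word boundary posited at each interior local minimum. It is an illustration under stated assumptions, not necessarily one of the five algorithms used in the study; the function names (`train_tp`, `segment`, `split_at_breaks`) and the toy syllable corpus are hypothetical.

```python
from collections import Counter


def train_tp(utterances):
    """Estimate forward transitional probabilities P(next | current) over syllables."""
    context, bigram = Counter(), Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            context[a] += 1          # times `a` is followed by something
            bigram[(a, b)] += 1      # times `a` is followed by `b`
    return {pair: n / context[pair[0]] for pair, n in bigram.items()}


def segment(utt, tp):
    """Posit a boundary where the TP is lower than both neighbouring TPs.

    Simplifying assumption: only interior transitions can be local minima,
    so the first and last transition of an utterance never get a boundary.
    """
    probs = [tp.get((a, b), 0.0) for a, b in zip(utt, utt[1:])]
    words, current = [], list(utt[:1])
    for i in range(len(probs)):
        interior = 0 < i < len(probs) - 1
        if interior and probs[i] < probs[i - 1] and probs[i] < probs[i + 1]:
            words.append(current)
            current = []
        current.append(utt[i + 1])
    if current:
        words.append(current)
    return words


def split_at_breaks(utt, breaks):
    """Cut an utterance into prosodic chunks.

    `breaks` holds the indices of syllables that are followed by a prosodic break.
    """
    chunks, start = [], 0
    for b in sorted(breaks):
        chunks.append(utt[start:b + 1])
        start = b + 1
    chunks.append(utt[start:])
    return [c for c in chunks if c]


# Toy corpus of syllabified utterances (hypothetical data).
corpus = [
    ["pre", "tty", "ba", "by"],
    ["pre", "tty", "do", "ggy"],
    ["ba", "by", "do", "ggy"],
]
tp = train_tp(corpus)
print(segment(["pre", "tty", "ba", "by"], tp))
```

In the "with prosody" condition, each utterance would first be passed through `split_at_breaks`, and the resulting chunks used for both training and segmentation, so that every prosodic break is treated as a given word boundary. If the breaks come from an error-prone automatic detector, a misplaced break splits a word that the segmenter can no longer recover, which is one way the negative effects described in the abstract could arise.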
