Integrative Neuroscience and Cognition Center (INCC-UMR 8002), Université Paris Descartes (Sorbonne Paris Cité), France; Integrative Neuroscience and Cognition Center (INCC-UMR 8002), CNRS, France.
Department of Psychology, University of British Columbia, Canada.
Lang Speech. 2020 Jun;63(2):264-291. doi: 10.1177/0023830919842353. Epub 2019 Apr 19.
The audiovisual speech signal contains multimodal cues to phrase boundaries. In three artificial language learning studies with 12 groups of adult participants, we investigated whether English monolinguals and bilingual speakers of English and a language with the opposite basic word order (i.e., in which objects precede verbs) can use word frequency, phrasal prosody, and co-speech (facial) visual information, namely head nods, to parse unknown languages into phrase-like units. We showed that monolinguals and bilinguals used both the auditory and visual sources of information to chunk "phrases" from the input. These results suggest that speech segmentation is a bimodal process, though the influence of co-speech facial gestures is rather limited and linked to the presence of auditory prosody. Importantly, a pragmatic factor, namely the language of the context, seems to determine the bilinguals' segmentation, overriding the auditory and visual cues and revealing a factor that begs further exploration.