
Syllable Inference as a Mechanism for Spoken Language Understanding.

Affiliations

Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA.

Department of Psychiatry and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, USA.

Publication information

Top Cogn Sci. 2021 Apr;13(2):351-398. doi: 10.1111/tops.12529. Epub 2021 Mar 29.

DOI: 10.1111/tops.12529
PMID: 33780156

Abstract

A classic problem in spoken language comprehension is how listeners perceive speech as being composed of discrete words, given the variable time-course of information in continuous signals. We propose a syllable inference account of spoken word recognition and segmentation, according to which alternative hierarchical models of syllables, words, and phonemes are dynamically posited, which are expected to maximally predict incoming sensory input. Generative models are combined with current estimates of context speech rate drawn from neural oscillatory dynamics, which are sensitive to amplitude rises. Over time, models which result in local minima in error between predicted and recently experienced signals give rise to perceptions of hearing words. Three experiments using the visual world eye-tracking paradigm with a picture-selection task tested hypotheses motivated by this framework. Materials were sentences that were acoustically ambiguous in numbers of syllables, words, and phonemes they contained (cf. English plural constructions, such as "saw (a) raccoon(s) swimming," which have two loci of grammatical information). Time-compressing, or expanding, speech materials permitted determination of how temporal information at, or in the context of, each locus affected looks to, and selection of, pictures with a singular or plural referent (e.g., one or more than one raccoon). Supporting our account, listeners probabilistically interpreted identical chunks of speech as consistent with a singular or plural referent to a degree that was based on the chunk's gradient rate in relation to its context. We interpret these results as evidence that arriving temporal information, judged in relation to language model predictions generated from context speech rate evaluated on a continuous scale, informs inferences about syllables, thereby giving rise to perceptual experiences of understanding spoken language as words separated in time.
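As a rough illustration of the idea that an identical chunk of speech can be heard as containing more or fewer syllables depending on its rate relative to the context, the toy sketch below maps a chunk's duration and a context speech rate onto a probability of perceiving one extra syllable. It is not the authors' model: the function name, the logistic form, the `slope` value, and the example numbers are all assumptions made for illustration only.

```python
import math


def p_extra_syllable(chunk_duration_s, context_rate_syll_per_s,
                     base_syllables=1, slope=6.0):
    """Toy probability that a chunk is heard as having one extra syllable.

    Hypothetical illustration only: the logistic link and the default
    parameter values are assumptions, not taken from the paper.
    """
    # Number of syllables the context rate "predicts" would fit in the chunk.
    expected = chunk_duration_s * context_rate_syll_per_s
    # Decision midpoint halfway between base_syllables and base_syllables + 1.
    midpoint = base_syllables + 0.5
    return 1.0 / (1.0 + math.exp(-slope * (expected - midpoint)))


# The same 0.3 s chunk judged against a slow and a fast context rate.
p_slow_context = p_extra_syllable(0.3, 4.0)
p_fast_context = p_extra_syllable(0.3, 6.0)
print(p_slow_context, p_fast_context)
```

In this sketch the output varies continuously with the ratio of chunk duration to context rate, which mirrors the gradient, probabilistic interpretation described in the abstract; the direction and size of the effect here are properties of the toy function, not reported results.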

Similar articles

1. Syllable Inference as a Mechanism for Spoken Language Understanding.
Top Cogn Sci. 2021 Apr;13(2):351-398. doi: 10.1111/tops.12529. Epub 2021 Mar 29.
2. Balancing Prediction and Sensory Input in Speech Comprehension: The Spatiotemporal Dynamics of Word Recognition in Context.
J Neurosci. 2019 Jan 16;39(3):519-527. doi: 10.1523/JNEUROSCI.3573-17.2018. Epub 2018 Nov 20.
3. Predictive Neural Computations Support Spoken Word Recognition: Evidence from MEG and Competitor Priming.
J Neurosci. 2021 Aug 11;41(32):6919-6932. doi: 10.1523/JNEUROSCI.1685-20.2021. Epub 2021 Jul 1.
4. Syllable or phoneme? A mouse-tracking investigation of phonological units in Mandarin Chinese and English spoken word recognition.
J Exp Psychol Learn Mem Cogn. 2023 Jan;49(1):130-176. doi: 10.1037/xlm0001128. Epub 2022 Jun 9.
5. Time course of Chinese monosyllabic spoken word recognition: evidence from ERP analyses.
Neuropsychologia. 2011 Jun;49(7):1761-70. doi: 10.1016/j.neuropsychologia.2011.02.054. Epub 2011 Mar 4.
6. Low-frequency neural activity reflects rule-based chunking during speech listening.
Elife. 2020 Apr 20;9:e55613. doi: 10.7554/eLife.55613.
7. Understanding environmental sounds in sentence context.
Cognition. 2018 Mar;172:134-143. doi: 10.1016/j.cognition.2017.12.009. Epub 2017 Dec 19.
8. Extrinsic Cognitive Load Impairs Spoken Word Recognition in High- and Low-Predictability Sentences.
Ear Hear. 2018 Mar/Apr;39(2):378-389. doi: 10.1097/AUD.0000000000000493.
9. Some Neurocognitive Correlates of Noise-Vocoded Speech Perception in Children With Normal Hearing: A Replication and Extension of ).
Ear Hear. 2017 May/Jun;38(3):344-356. doi: 10.1097/AUD.0000000000000393.
10. Spoken Word Recognition of Chinese Words in Continuous Speech.
J Psycholinguist Res. 2015 Dec;44(6):775-87. doi: 10.1007/s10936-014-9318-2.

Cited by

1. Decoupling speech processing from time.
Trends Cogn Sci. 2025 Jun 25. doi: 10.1016/j.tics.2025.05.017.
2. Maintenance of subcategorical information during speech perception: revisiting misunderstood limitations.
J Mem Lang. 2025 Feb;140. doi: 10.1016/j.jml.2024.104565. Epub 2024 Sep 20.
3. Encoding speech rate in challenging listening conditions: White noise and reverberation.
Atten Percept Psychophys. 2022 Oct;84(7):2303-2318. doi: 10.3758/s13414-022-02554-8. Epub 2022 Aug 22.
4. Neural dynamics differentially encode phrases and sentences during spoken language comprehension.
PLoS Biol. 2022 Jul 14;20(7):e3001713. doi: 10.1371/journal.pbio.3001713. eCollection 2022 Jul.
5. Differential contributions of synaptic and intrinsic inhibitory currents to speech segmentation via flexible phase-locking in neural oscillators.
PLoS Comput Biol. 2021 Apr 14;17(4):e1008783. doi: 10.1371/journal.pcbi.1008783. eCollection 2021 Apr.