Robust speech perception: recognize the familiar, generalize to the similar, and adapt to the novel.

Authors

Kleinschmidt Dave F, Jaeger T Florian

Affiliations

Department of Brain and Cognitive Sciences, University of Rochester.

Departments of Brain and Cognitive Sciences, Computer Science, and Linguistics, University of Rochester.

Publication information

Psychol Rev. 2015 Apr;122(2):148-203. doi: 10.1037/a0038695.

DOI:10.1037/a0038695

PMID:25844873

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4744792/

Abstract

Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker's /p/ might be physically indistinguishable from another talker's /b/ (cf. lack of invariance). We characterize the computational problem posed by such a subjectively nonstationary world and propose that the speech perception system overcomes this challenge by (a) recognizing previously encountered situations, (b) generalizing to other situations based on previous similar experience, and (c) adapting to novel situations. We formalize this proposal in the ideal adapter framework: (a) to (c) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on 2 critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires that listeners learn to represent the structured component of cross-situation variability in the speech signal. We discuss how these 2 aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.
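The belief-updating idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a single acoustic cue (voice onset time, VOT), Gaussian categories with a known standard deviation, and a conjugate normal prior on each category mean. The class name `CategoryBelief`, the function `prob_p`, and all numeric values are hypothetical and chosen only for illustration; the published model may differ in its priors and in which parameters it updates.

```python
import math
from dataclasses import dataclass


@dataclass
class CategoryBelief:
    """Listener's belief about one category's cue distribution for a talker."""
    mean: float   # expected cue value, e.g. voice onset time (VOT) in ms
    kappa: float  # pseudo-count: how strongly prior experience supports `mean`
    sd: float     # within-category standard deviation, assumed known here

    def update(self, cue: float) -> None:
        """Conjugate normal-normal update of the mean after one labeled token."""
        self.mean = (self.kappa * self.mean + cue) / (self.kappa + 1)
        self.kappa += 1

    def predictive_density(self, cue: float) -> float:
        """Posterior predictive density; variance includes uncertainty about the mean."""
        var = self.sd ** 2 * (1 + 1 / self.kappa)
        return math.exp(-0.5 * (cue - self.mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def prob_p(cue: float, b: CategoryBelief, p: CategoryBelief, prior_p: float = 0.5) -> float:
    """Posterior probability of /p/ given the cue, by Bayes' rule."""
    weight_p = prior_p * p.predictive_density(cue)
    weight_b = (1 - prior_p) * b.predictive_density(cue)
    return weight_p / (weight_p + weight_b)


# Illustrative values only (assumptions, not taken from the paper).
b = CategoryBelief(mean=0.0, kappa=10.0, sd=12.0)
p = CategoryBelief(mean=60.0, kappa=10.0, sd=12.0)

ambiguous_vot = 30.0
before = prob_p(ambiguous_vot, b, p)

# Exposure: ten ambiguous tokens that context disambiguates as /b/.
for _ in range(10):
    b.update(ambiguous_vot)

after = prob_p(ambiguous_vot, b, p)
print(f"P(/p/ | VOT=30) before: {before:.3f}, after: {after:.3f}")
```

In this sketch the ambiguous token starts exactly on the category boundary. After exposure the believed /b/ mean moves toward the talker's productions, so the same token is more likely to be heard as /b/, which is the qualitative pattern described as perceptual recalibration.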


Similar articles

Distributional learning for speech reflects cumulative exposure to a talker's phonetic distributions.

Psychon Bull Rev. 2019 Jun;26(3):985-992. doi: 10.3758/s13423-018-1551-5.

Talker familiarity and the accommodation of talker variability.

Atten Percept Psychophys. 2021 May;83(4):1842-1860. doi: 10.3758/s13414-020-02203-y. Epub 2021 Jan 4.

Sociolinguistic Perception as Inference Under Uncertainty.

Top Cogn Sci. 2018 Oct;10(4):818-834. doi: 10.1111/tops.12331. Epub 2018 Mar 15.

Lexically guided phonetic retuning of foreign-accented speech and its generalization.

J Exp Psychol Hum Percept Perform. 2014 Apr;40(2):539-55. doi: 10.1037/a0034409. Epub 2013 Sep 23.

Talker-specific learning in speech perception.

Percept Psychophys. 1998 Apr;60(3):355-76. doi: 10.3758/bf03206860.

Accent-independent adaptation to foreign accented speech.

J Acoust Soc Am. 2013 Mar;133(3):EL174-80. doi: 10.1121/1.4789864.

Perceptual adaptation to non-native speech.

Cognition. 2008 Feb;106(2):707-29. doi: 10.1016/j.cognition.2007.04.005. Epub 2007 May 29.

Talker variability in audio-visual speech perception.

Front Psychol. 2014 Jul 16;5:698. doi: 10.3389/fpsyg.2014.00698. eCollection 2014.

Talker-specific influences on phonetic category structure.

J Acoust Soc Am. 2015 Aug;138(2):1068-78. doi: 10.1121/1.4927489.

Cited by

In the Words of Others: ERP Evidence of Speaker-Specific Phonological Prediction.

Psychophysiology. 2025 Sep;62(9):e70135. doi: 10.1111/psyp.70135.

Prediction efficiency and incremental processing strategy during spoken language comprehension in autistic children: an eye-tracking study.

Mol Autism. 2025 Aug 4;16(1):39. doi: 10.1186/s13229-025-00674-0.

Can Informativity Effects Be Predictability Effects in Disguise?

Entropy (Basel). 2025 Jul 10;27(7):739. doi: 10.3390/e27070739.

Multiple timescales of context influence perceptual sensitivity to common pairings of musical pitch and timbre.

PLoS One. 2025 Jul 18;20(7):e0328490. doi: 10.1371/journal.pone.0328490. eCollection 2025.

Beating stress: Evidence for recalibration of word stress perception.

Atten Percept Psychophys. 2025 May 20. doi: 10.3758/s13414-025-03088-5.

Cents and shenshibility: The role of reward in talker-specific phonetic recalibration.

Atten Percept Psychophys. 2025 Apr 11. doi: 10.3758/s13414-025-03048-z.

SingleMALD: Investigating practice effects in auditory lexical decision.

Behav Res Methods. 2025 Apr 2;57(5):136. doi: 10.3758/s13428-025-02628-z.

Neuroimaging Findings for the Overnight Consolidation of Learned Non-native Speech Sounds.

Neurobiol Lang (Camb). 2025 Jan 10;6. doi: 10.1162/nol_a_00157. eCollection 2025.

Cognitive Predictors of Perception and Adaption to Dysarthric Speech in Older Adults.

J Speech Lang Hear Res. 2025 Jul 29;68(7S):3507-3524. doi: 10.1044/2024_JSLHR-24-00345. Epub 2025 Jan 7.

Linguistic diversity shapes flexible speech perception in school age children.

Sci Rep. 2024 Nov 21;14(1):28825. doi: 10.1038/s41598-024-80430-1.

