Suppr 超能文献


Modeling the Development of Audiovisual Cue Integration in Speech Perception.

Authors

Getz Laura M, Nordeen Elke R, Vrabic Sarah C, Toscano Joseph C

Affiliation

Department of Psychology, Villanova University, Villanova, PA 19085, USA.

Publication

Brain Sci. 2017 Mar 21;7(3):32. doi: 10.3390/brainsci7030032.

DOI: 10.3390/brainsci7030032
PMID: 28335558
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5366831/
Abstract

Adult speech perception is generally enhanced when information is provided from multiple modalities. In contrast, infants do not appear to benefit from combining auditory and visual speech information early in development. This is true despite the fact that both modalities are important to speech comprehension even at early stages of language acquisition. How then do listeners learn how to process auditory and visual information as part of a unified signal? In the auditory domain, statistical learning processes provide an excellent mechanism for acquiring phonological categories. Is this also true for the more complex problem of acquiring audiovisual correspondences, which require the learner to integrate information from multiple modalities? In this paper, we present simulations using Gaussian mixture models (GMMs) that learn cue weights and combine cues on the basis of their distributional statistics. First, we simulate the developmental process of acquiring phonological categories from auditory and visual cues, asking whether simple statistical learning approaches are sufficient for learning multi-modal representations. Second, we use this time course information to explain audiovisual speech perception in adult perceivers, including cases where auditory and visual input are mismatched. Overall, we find that domain-general statistical learning techniques allow us to model the developmental trajectory of audiovisual cue integration in speech, and in turn, allow us to better understand the mechanisms that give rise to unified percepts based on multiple cues.
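The abstract's core mechanism, learning phonological categories from the distributional statistics of a cue with a Gaussian mixture model, can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: it fits a two-component 1D mixture by expectation-maximization to a bimodal "cue" distribution (the VOT-like values and all parameters are assumptions for demonstration), showing how two categories emerge without labels.

```python
import numpy as np

def fit_gmm_1d(x, n_iter=200):
    """Fit a two-component 1D Gaussian mixture with EM (illustrative sketch)."""
    # Initialize component means from the data's quartiles, shared variance
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    var = np.array([x.var(), x.var()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Hypothetical bimodal cue distribution, e.g. voice-onset time in ms:
# a short-lag category near 0 ms and a long-lag category near 60 ms
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 5, 500), rng.normal(60, 10, 500)])
w, mu, var = fit_gmm_1d(x)
```

The learned means land near the two generating category centers, which is the sense in which simple distributional learning suffices for category acquisition; the paper extends this idea to jointly weighting auditory and visual cues.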


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/f85b16b2d9c4/brainsci-07-00032-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/ba14617b0ee6/brainsci-07-00032-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/f8b0c2a0c193/brainsci-07-00032-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/1908b24c5865/brainsci-07-00032-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/fc86160ddb77/brainsci-07-00032-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/d7002baabfc8/brainsci-07-00032-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/ca974c3ae4ca/brainsci-07-00032-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/636e99a8a35e/brainsci-07-00032-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/2e6752a8537c/brainsci-07-00032-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0b/5366831/0a8c2ab7e444/brainsci-07-00032-g010.jpg

Similar Articles

1. Modeling the Development of Audiovisual Cue Integration in Speech Perception.
Brain Sci. 2017 Mar 21;7(3):32. doi: 10.3390/brainsci7030032.
2. Cue integration with categories: Weighting acoustic cues in speech using unsupervised learning and distributional statistics.
Cogn Sci. 2010 Apr;34(3):434-464. doi: 10.1111/j.1551-6709.2009.01077.x.
3. Degradation of labial information modifies audiovisual speech perception in cochlear-implanted children.
Ear Hear. 2013 Jan-Feb;34(1):110-21. doi: 10.1097/AUD.0b013e3182670993.
4. Neural Mechanisms Underlying Cross-Modal Phonetic Encoding.
J Neurosci. 2018 Feb 14;38(7):1835-1849. doi: 10.1523/JNEUROSCI.1566-17.2017. Epub 2017 Dec 20.
5. Discriminating Non-native Vowels on the Basis of Multimodal, Auditory or Visual Information: Effects on Infants' Looking Patterns and Discrimination.
Front Psychol. 2016 Apr 19;7:525. doi: 10.3389/fpsyg.2016.00525. eCollection 2016.
6. Prediction and constraint in audiovisual speech perception.
Cortex. 2015 Jul;68:169-81. doi: 10.1016/j.cortex.2015.03.006. Epub 2015 Mar 20.
7. Development of the Mechanisms Underlying Audiovisual Speech Perception Benefit.
Brain Sci. 2021 Jan 5;11(1):49. doi: 10.3390/brainsci11010049.
8. A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.
PLoS Comput Biol. 2017 Feb 16;13(2):e1005229. doi: 10.1371/journal.pcbi.1005229. eCollection 2017 Feb.
9. Neural initialization of audiovisual integration in prereaders at varying risk for developmental dyslexia.
Hum Brain Mapp. 2017 Feb;38(2):1038-1055. doi: 10.1002/hbm.23437. Epub 2016 Oct 14.
10. Learning to match auditory and visual speech cues: social influences on acquisition of phonological categories.
Child Dev. 2015 Mar-Apr;86(2):362-78. doi: 10.1111/cdev.12320. Epub 2014 Nov 18.

Cited By

1. A neural network model of the effect of prior experience with regularities on subsequent category learning.
Cognition. 2022 May;222:104997. doi: 10.1016/j.cognition.2021.104997. Epub 2022 Jan 7.
2. Acoustic noise and vision differentially warp the auditory categorization of speech.
J Acoust Soc Am. 2019 Jul;146(1):60. doi: 10.1121/1.5114822.

References

1. Newborns' sensitivity to the visual aspects of infant-directed speech: Evidence from point-line displays of talking faces.
J Exp Psychol Hum Percept Perform. 2016 Sep;42(9):1275-81. doi: 10.1037/xhp0000208. Epub 2016 Apr 28.
2. Audiovisual speech perception development at varying levels of perceptual processing.
J Acoust Soc Am. 2016 Apr;139(4):1713. doi: 10.1121/1.4945590.
3. The temporal binding window for audiovisual speech: Children are like little adults.
Neuropsychologia. 2016 Jul 29;88:74-82. doi: 10.1016/j.neuropsychologia.2016.02.017. Epub 2016 Feb 23.
4. The early maximum likelihood estimation model of audiovisual integration in speech perception.
J Acoust Soc Am. 2015 May;137(5):2884-91. doi: 10.1121/1.4916691.
5. Perception of the multisensory coherence of fluent audiovisual speech in infancy: its emergence and the role of experience.
J Exp Child Psychol. 2015 Feb;130:147-62. doi: 10.1016/j.jecp.2014.10.006. Epub 2014 Nov 11.
6. Motherese by eye and ear: infants perceive visual prosody in point-line displays of talking heads.
PLoS One. 2014 Oct 29;9(10):e111467. doi: 10.1371/journal.pone.0111467. eCollection 2014.
7. Audiovisual integration in children listening to spectrally degraded speech.
J Speech Lang Hear Res. 2015 Feb;58(1):61-8. doi: 10.1044/2014_JSLHR-S-14-0044.
8. Infant perception of audio-visual speech synchrony in familiar and unfamiliar fluent speech.
Acta Psychol (Amst). 2014 Jun;149:142-7. doi: 10.1016/j.actpsy.2013.12.013. Epub 2014 Feb 25.
9. Audio-visual speech perception: a developmental ERP investigation.
Dev Sci. 2014 Jan;17(1):110-24. doi: 10.1111/desc.12098. Epub 2013 Oct 31.
10. Degrading phonetic information affects matching of audiovisual speech in adults, but not in infants.
Cognition. 2014 Jan;130(1):31-43. doi: 10.1016/j.cognition.2013.09.006. Epub 2013 Oct 18.