Processing speech signal using auditory-like filterbank provides least uncertainty about articulatory gestures.

Affiliation

Signal Analysis and Interpretation Laboratory, Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA.

Publication information

J Acoust Soc Am. 2011 Jun;129(6):4014-22. doi: 10.1121/1.3573987.

Abstract

Understanding how the human speech production system is related to the human auditory system has been a perennial subject of inquiry. To investigate the production-perception link, in this paper, a computational analysis has been performed using the articulatory movement data obtained during speech production with concurrently recorded acoustic speech signals from multiple subjects in three different languages: English, Cantonese, and Georgian. The form of articulatory gestures during speech production varies across languages, and this variation is considered to be reflected in the articulatory position and kinematics. The auditory processing of the acoustic speech signal is modeled by a parametric representation of the cochlear filterbank which allows for realizing various candidate filterbank structures by changing the parameter value. Using mathematical communication theory, it is found that the uncertainty about the articulatory gestures in each language is maximally reduced when the acoustic speech signal is represented using the output of a filterbank similar to the empirically established cochlear filterbank in the human auditory system. Possible interpretations of this finding are discussed.
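
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-parameter warping in `warped_center_frequencies`, the parameter name `alpha`, and the plug-in estimator in `mutual_information_bits` are all assumptions made for illustration. The idea is that each parameter value yields a candidate filterbank, and the candidate whose quantized output shares the most information with the articulatory labels leaves the least uncertainty about the gestures.

```python
import math
from collections import Counter


def warped_center_frequencies(n_filters, f_max_hz, alpha):
    """Hypothetical one-parameter family of filter center frequencies.

    alpha == 0 gives uniform (linear) spacing; larger alpha packs more
    filters into the low-frequency region, as auditory-like scales do.
    """
    centers = []
    for k in range(1, n_filters + 1):
        u = k / (n_filters + 1)  # uniform position in (0, 1)
        if alpha == 0:
            w = u
        else:
            w = math.expm1(alpha * u) / math.expm1(alpha)
        centers.append(f_max_hz * w)
    return centers


def mutual_information_bits(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from two equal-length label sequences.

    xs could be quantized filterbank outputs and ys quantized articulatory
    positions (both assumed to be discretized beforehand).
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


if __name__ == "__main__":
    print(warped_center_frequencies(3, 8000, 0))
    print(warped_center_frequencies(3, 8000, 3.0))
    # Toy labels: identical sequences share all their information,
    # independent ones share none.
    print(mutual_information_bits([0, 0, 1, 1], [0, 0, 1, 1]))
    print(mutual_information_bits([0, 0, 1, 1], [0, 1, 0, 1]))
```

In a full analysis, one would compute acoustic features for each candidate `alpha`, quantize them, and keep the value that maximizes the estimated mutual information with the articulatory data.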
