Flexible vowel recognition by the generation of dynamic coherence in oscillator neural networks: speaker-independent vowel recognition.

Authors

Liu F, Yamaguchi Y, Shimizu H

Affiliation

Faculty of Pharmaceutical Sciences, University of Tokyo, Japan.

Publication details

Biol Cybern. 1994;71(2):105-14. doi: 10.1007/BF00197313.

DOI: 10.1007/BF00197313
PMID: 8068772

Abstract

We propose a new model for speaker-independent vowel recognition that uses the flexibility of dynamic linking arising from the synchronization of oscillating neural units. The system consists of an input layer and three neural layers, referred to as the A-, B- and C-centers. The input signals are a time series of linear prediction (LPC) spectrum envelopes of auditory signals. At each time window within the series, the A-center receives the input signals, extracts the local peaks of the spectrum envelope (i.e., the formants), and encodes them into local groups of independent oscillations. Speaker-independent vowel characteristics are embedded as a connection matrix in the B-center according to statistical data on Japanese vowels. The associative interaction within the B-center and the reciprocal interaction between the A- and B-centers selectively activate a vowel as a globally synchronized pattern across the two centers. The C-center evaluates the synchronized activities among the three formant regions and outputs the selected category among the five Japanese vowels. Thus, a flexible capacity for dynamic linking among features is achieved across the three centers. The capability of the present system was investigated for speaker-independent recognition of Japanese vowels. The system demonstrated a remarkable ability to recognize vowels, very similar to that of human listeners, including for misleading vowels. In addition, it showed stable recognition of unsteady input signals and robustness against background noise. The optimum condition for the frequency of oscillation is discussed in comparison with the stimulus-dependent synchronizations observed in neurophysiological experiments on the cortex.
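
The abstract names two mechanisms without giving equations: picking formants as local peaks of a spectrum envelope, and reading out whether oscillating units have synchronized. The sketch below illustrates both ideas only in spirit. It is not the authors' model: it substitutes all-to-all Kuramoto phase oscillators for the unspecified neural units. The function names (`local_peaks`, `coherence`, `simulate`), the toy envelope values, the 39 to 41 Hz unit frequencies, and the coupling strength are all assumptions made for illustration.

```python
import cmath
import math


def local_peaks(envelope):
    """Return indices of strict local maxima in a sampled spectrum envelope."""
    return [
        i
        for i in range(1, len(envelope) - 1)
        if envelope[i - 1] < envelope[i] > envelope[i + 1]
    ]


def coherence(phases):
    """Kuramoto order parameter: 1.0 means all phases coincide."""
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


def simulate(phases, freqs, coupling, dt=0.001, steps=5000):
    """Euler-integrate all-to-all coupled phase oscillators (Kuramoto model).

    phases: initial phases in radians; freqs: natural frequencies in rad/s.
    """
    phases = list(phases)
    n = len(phases)
    for _ in range(steps):
        drift = [
            freqs[i] + coupling / n * sum(math.sin(pj - phases[i]) for pj in phases)
            for i in range(n)
        ]
        phases = [p + dt * d for p, d in zip(phases, drift)]
    return phases


if __name__ == "__main__":
    # Toy envelope with three bumps standing in for three formants (made-up values).
    envelope = [0.1, 0.5, 0.2, 0.3, 0.9, 0.4, 0.2, 0.6, 0.1]
    print("peak indices:", local_peaks(envelope))

    # One oscillator per peak, with slightly different natural frequencies (assumed).
    start = [0.0, 2.0, 4.0]
    freqs = [2 * math.pi * f for f in (39.0, 40.0, 41.0)]
    print("coherence before coupling:", coherence(start))
    print("coherence after coupling:", coherence(simulate(start, freqs, coupling=50.0)))
```

In this toy setting the order parameter starts low, because the three phases are spread around the circle, and approaches 1 once the coupling pulls the units into a common rhythm. A readout stage in the spirit of the C-center could threshold such a coherence value, although the abstract does not say how the actual evaluation is performed.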
