

Speech recognition by an artificial neural network using findings on the afferent auditory system.

Author information

Kurogi S

Affiliation

Division of Control Engineering, Kyushu Institute of Technology, Kitakyushu, Japan.

Publication information

Biol Cybern. 1991;64(3):243-9. doi: 10.1007/BF00201985.

Abstract

An artificial neural network which uses anatomical and physiological findings on the afferent pathway from the ear to the cortex is presented and the roles of the constituent functions in recognition of continuous speech are examined. The network deals with successive spectra of speech sounds by a cascade of several neural layers: lateral excitation layer (LEL), lateral inhibition layer (LIL), and a pile of feature detection layers (FDL's). These layers are shown to be effective for recognizing spoken words. Namely, first, LEL reduces the distortion of sound spectrum caused by the pitch of speech sounds. Next, LIL emphasizes the major energy peaks of sound spectrum, the formants. Last, FDL's detect syllables and words in successive formants, where two functions, time-delay and strong adaptation, play important roles: time-delay makes it possible to retain the pattern of formant changes for a period to detect spoken words successively; strong adaptation contributes to removing the time-warp of formant changes. Digital computer simulations show that the network detects isolated syllables, isolated words, and connected words in continuous speech, while reproducing the fundamental responses found in the auditory system such as ON, OFF, ON-OFF, and SUSTAINED patterns.
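
The first two stages described in the abstract can be illustrated with a toy sketch. This is not the network from the paper: the 16-channel spectrum, the three-tap kernels, their weights, and the function names are all assumptions chosen for illustration. The sketch models pitch as harmonic ripple (energy only in every other channel) laid over a smooth envelope with peaks at channels 4 and 12, then applies a smoothing step standing in for LEL and a centre-surround step standing in for LIL.

```python
def convolve_same(signal, kernel):
    """Convolve with zero padding; the output has the same length as signal."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def lateral_excitation(spectrum):
    # Assumed kernel: each channel pools activity from its neighbours,
    # which fills in the gaps between pitch harmonics.
    return convolve_same(spectrum, [0.25, 0.5, 0.25])


def lateral_inhibition(spectrum):
    # Assumed kernel: each channel is suppressed by its neighbours, and
    # negative responses are clipped to zero, so only local peaks survive.
    raw = convolve_same(spectrum, [-0.5, 1.0, -0.5])
    return [max(0.0, value) for value in raw]


def active_channels(spectrum):
    return [i for i, value in enumerate(spectrum) if value > 0]


# Hypothetical spectral envelope with two formant-like peaks (channels 4, 12).
envelope = [1, 2, 4, 7, 9, 7, 4, 2, 1, 2, 4, 7, 9, 7, 4, 2]

# Hypothetical pitch ripple: only every other channel carries energy.
spectrum = [float(e) if i % 2 == 0 else 0.0 for i, e in enumerate(envelope)]

# Inhibition alone responds to every harmonic, not to the envelope peaks.
raw_channels = active_channels(lateral_inhibition(spectrum))

# Excitation followed by inhibition leaves only the two envelope peaks.
formant_channels = active_channels(lateral_inhibition(lateral_excitation(spectrum)))

print("inhibition only:        ", raw_channels)
print("excitation + inhibition:", formant_channels)
```

In this toy setting, inhibition applied directly to the rippled spectrum marks every harmonic as a peak, while smoothing first leaves only the two envelope maxima. That is one way to read the abstract's claim that LEL reduces pitch-induced distortion before LIL emphasizes the formants. The time-delay and adaptation functions of the FDL's are not modelled here.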

