Improved speech inversion using general regression neural network.

Author information

Najnin Shamima, Banerjee Bonny

Affiliation

Institute for Intelligent Systems, and Department of Electrical and Computer Engineering, 3815 Central Avenue, The University of Memphis, Memphis, Tennessee 38152, USA

Publication information

J Acoust Soc Am. 2015 Sep;138(3):EL229-35. doi: 10.1121/1.4929626.

Abstract

The problem of nonlinear acoustic-to-articulatory inversion mapping is investigated in the feature space using two models: the deep belief network (DBN), which is the state of the art, and the general regression neural network (GRNN). The task is to estimate a set of articulatory features for improved speech recognition. Experiments with the MOCHA-TIMIT and MNGU0 databases reveal that, for speech inversion, GRNN yields a lower root-mean-square error and a higher correlation than DBN. It is also shown that combining acoustic features with GRNN-estimated articulatory features yields state-of-the-art accuracy in broad-class phonetic classification and phoneme recognition while using less computational power.
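
The abstract does not describe the GRNN configuration used, so the following is only a minimal sketch under stated assumptions. It assumes the standard GRNN formulation, in which the estimate for a query is a Gaussian-kernel-weighted average of the training targets, and it includes the two evaluation measures named above (root-mean-square error and correlation). The function names, the smoothing parameter `sigma`, and the toy one-dimensional "acoustic" and "articulatory" values are all hypothetical and are not taken from the paper.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Estimate the target vector for `query` as a Gaussian-weighted
    average of the training targets (assumed standard GRNN form)."""
    # Squared Euclidean distance from the query to every training input.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    # Shift by the smallest distance so at least one weight equals 1;
    # this avoids underflow and leaves the normalized weights unchanged.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(weights)
    dim = len(train_y[0])
    return [
        sum(w * y[j] for w, y in zip(weights, train_y)) / total
        for j in range(dim)
    ]


def rmse(pred, true):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def pearson(pred, true):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(true)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)


# Hypothetical toy data: one acoustic feature mapped to one articulatory feature.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [[0.0], [2.0], [4.0], [6.0]]

# Estimate the articulatory value for a set of unseen acoustic frames.
test_x = [[0.5], [1.5], [2.5]]
test_y = [1.0, 3.0, 5.0]
estimates = [grnn_predict(train_x, train_y, q)[0] for q in test_x]
print("RMSE:", rmse(estimates, test_y))
print("Correlation:", pearson(estimates, test_y))
```

In this form there is no iterative training: the training pairs are stored and `sigma` is the only parameter to tune. That property is one plausible reading of the reduced computational cost mentioned above, though the abstract does not state the reason.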
