[A multiscale feature extraction algorithm for dysarthric speech recognition].

Authors

Zhao Jianxing, Xue Peiyun, Bai Jing, Shi Chenkang, Yuan Bo, Shi Tongtong

Affiliation

School of Information and Computer Science, Taiyuan University of Technology, Taiyuan 030024, P. R. China.

Publication information

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2023 Feb 25;40(1):44-50. doi: 10.7507/1001-5515.202205049.

Abstract

We propose a multiscale mel-domain feature map extraction algorithm to address the difficulty of improving the recognition rate for dysarthric speech. First, we decomposed each speech signal with empirical mode decomposition and, for each of the three effective components, extracted Fbank features and their first-order differences; together these form a new feature map that captures fine detail in the frequency domain. Second, because training a single-channel neural network suffers from loss of effective features and high computational complexity, we proposed a speech recognition network model to address both problems. Finally, we performed training and decoding on the public UA-Speech dataset. The resulting speech recognition model reached an accuracy of 92.77%, indicating that the proposed algorithm can effectively improve the recognition rate for dysarthric speech.
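
The feature-map construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the three effective components are already available (empirical mode decomposition itself is not implemented here, so synthetic sinusoids stand in for them), and the 16 kHz sampling rate, 512-point FFT, 160-sample hop, 40 mel bands, the simple frame-to-frame difference, and the channel layout are all illustrative choices that the abstract does not specify. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for m in range(n_mels):
        lo, mid, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def fbank(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Log mel filterbank energies, shape (n_frames, n_mels)."""
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel_energy + 1e-10)  # small floor avoids log(0)

def multiscale_feature_map(components, sr=16000):
    """Stack Fbank and first-order difference for each component.

    Assumed layout: (2 * n_components, n_frames, n_mels).
    """
    channels = []
    for comp in components:
        static = fbank(comp, sr=sr)
        # Frame-to-frame difference; the first frame's difference is zero.
        delta = np.diff(static, axis=0, prepend=static[:1])
        channels.extend([static, delta])
    return np.stack(channels)

# Synthetic stand-ins for the three effective components (assumption:
# a real pipeline would obtain these from empirical mode decomposition).
t = np.arange(16000) / 16000.0
components = [np.sin(2 * np.pi * f * t) for f in (2000.0, 500.0, 120.0)]
feature_map = multiscale_feature_map(components)
print(feature_map.shape)
```

With one second of audio at these settings, the sketch yields six channels (static and difference features for each of the three components) of 97 frames by 40 mel bands. Stacking along a channel axis is only one plausible arrangement; concatenating along the feature axis would be equally consistent with the abstract.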
