Estimation of a priori signal-to-noise ratio using neurograms for speech enhancement.

Author information

Jassim Wissam A, Harte Naomi

Affiliations

Sigmedia Group, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland.

Publication information

J Acoust Soc Am. 2020 Jun;147(6):3830. doi: 10.1121/10.0001324.

Abstract

In statistical-based speech enhancement algorithms, the a priori signal-to-noise ratio (SNR) must be estimated to calculate the required spectral gain function. This paper proposes a method to improve this estimation using features derived from the neural responses of the auditory-nerve (AN) system. The neural responses, interpreted as a neurogram (NG), are simulated for noisy speech using a computational model of the AN system with a range of characteristic frequencies (CFs). Two machine learning algorithms were explored to train the estimation model based on NG features: support vector regression and a convolutional neural network. The proposed estimator was placed in a common speech enhancement system, and three conventional spectral gain functions were employed to estimate the enhanced signal. The proposed method was tested using the NOIZEUS database at different SNR levels, and various speech quality and intelligibility measures were employed for performance evaluation. The a priori SNR estimated from NG features achieved better quality and intelligibility scores than those obtained with recent estimators, especially for highly distorted speech and low SNR values.
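To illustrate how an a priori SNR estimate feeds a spectral gain function, here is a minimal sketch. It assumes the classical Wiener gain, G = ξ / (1 + ξ), where ξ is the a priori SNR on a linear scale. The abstract does not name the three gain functions that were evaluated, so the Wiener gain is used only as a familiar example; the function names and the sample values are hypothetical and do not come from the paper.

```python
from typing import List


def db_to_linear(snr_db: float) -> float:
    """Convert an SNR in decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def wiener_gain(xi: float) -> float:
    """Wiener spectral gain for a linear-scale a priori SNR xi >= 0.

    Assumption: the Wiener gain stands in for a generic spectral gain
    function; it is not claimed to be one of the gains used in the paper.
    """
    if xi < 0.0:
        raise ValueError("a priori SNR must be non-negative")
    return xi / (1.0 + xi)


def apply_gain(noisy_magnitudes: List[float], xi_estimates: List[float]) -> List[float]:
    """Scale each noisy spectral magnitude by the gain for its frequency bin."""
    if len(noisy_magnitudes) != len(xi_estimates):
        raise ValueError("one a priori SNR estimate is needed per frequency bin")
    return [wiener_gain(xi) * y for y, xi in zip(noisy_magnitudes, xi_estimates)]


if __name__ == "__main__":
    # Hypothetical per-bin estimates in dB, as any estimator might supply them.
    xi_db = [-10.0, 0.0, 10.0]
    xi = [db_to_linear(v) for v in xi_db]
    noisy = [1.0, 1.0, 1.0]
    for v, g in zip(xi_db, apply_gain(noisy, xi)):
        print(f"a priori SNR {v:+.0f} dB -> gain {g:.3f}")
```

The gain approaches 0 as ξ falls and 1 as ξ rises, so bins dominated by noise are attenuated most. This is why errors in the a priori SNR estimate matter most at low SNR.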
