Reasons why current speech-enhancement algorithms do not improve speech intelligibility and suggested solutions.

Authors

Loizou Philipos C, Kim Gibak

Affiliations

The authors are with the Department of Electrical Engineering, University of Texas at Dallas, Richardson, TX 75083-0688 USA.

Publication information

IEEE Trans Audio Speech Lang Process. 2011;19(1):47-56. doi: 10.1109/TASL.2010.2045180.

Abstract

Existing speech enhancement algorithms can improve speech quality but not speech intelligibility, and the reasons for that are unclear. In this paper, we present a theoretical framework that can be used to analyze potential factors that can influence the intelligibility of processed speech. More specifically, this framework focuses on the fine-grain analysis of the distortions introduced by speech enhancement algorithms. It is hypothesized that if these distortions are properly controlled, then large gains in intelligibility can be achieved. To test this hypothesis, intelligibility tests are conducted with human listeners in which we present processed speech with controlled speech distortions. The aim of these tests is to assess the perceptual effect of the various distortions that can be introduced by speech enhancement algorithms on speech intelligibility. Results with three different enhancement algorithms indicated that certain distortions are more detrimental to speech intelligibility than others. When these distortions were properly controlled, however, large gains in intelligibility were obtained by human listeners, even by spectral-subtractive algorithms which are known to degrade speech quality and intelligibility.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ebe/3169296/3da38c4bf12e/nihms318747f1.jpg

Similar articles

2
Predicting the intelligibility of vocoded speech.
Ear Hear. 2011 May-Jun;32(3):331-8. doi: 10.1097/AUD.0b013e3181ff3515.

Cited by

9
Supervised Speech Separation Based on Deep Learning: An Overview.
IEEE/ACM Trans Audio Speech Lang Process. 2018 Oct;26(10):1702-1726. doi: 10.1109/TASLP.2018.2842159. Epub 2018 May 30.
10
Efficacy of a Hearing Aid Noise Reduction Function.
Trends Hear. 2018 Jan-Dec;22:2331216518782839. doi: 10.1177/2331216518782839.
