Time-frequency masking for speech separation and its potential for hearing aid design.

Affiliation

Department of Computer Science & Engineering, Center for Cognitive Science, The Ohio State University, Columbus, OH 43210, USA.

Publication information

Trends Amplif. 2008 Dec;12(4):332-53. doi: 10.1177/1084713808326455. Epub 2008 Oct 30.

DOI:10.1177/1084713808326455

PMID:18974204

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4111459/

Abstract

A new approach to separating speech from speech-in-noise mixtures is time-frequency (T-F) masking. Originating in the field of computational auditory scene analysis, T-F masking performs separation in the time-frequency domain. This article introduces the T-F masking concept and reviews T-F masking algorithms that separate target speech from monaural or binaural mixtures, as well as from microphone-array recordings. The review emphasizes techniques that are promising for hearing aid design. This article also surveys recent studies that evaluate the perceptual effects of T-F masking techniques, particularly their effectiveness in improving human speech recognition in noise. The potential benefits of T-F masking methods for the hearing impaired are assessed in light of the processing constraints of hearing aids. Finally, several issues pertinent to T-F masking are discussed.
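To make the idea of separation in the time-frequency domain concrete, the following is a minimal sketch of one common formulation, the ideal binary mask: a T-F unit is kept when the local target-to-noise ratio exceeds a threshold and discarded otherwise. This is an illustration under stated assumptions, not the specific algorithms reviewed in the article. It assumes oracle access to the target and noise signals separately, and the function names (`stft`, `ideal_binary_mask`, `apply_mask`), the frame length, the hop size, and the 0 dB threshold are all hypothetical choices made for the example.

```python
import cmath
import math


def stft(signal, frame_len=64, hop=32):
    """Naive short-time Fourier transform with a periodic Hann window.

    Returns frames[t][k]: complex value at time frame t, frequency bin k.
    Illustrative only (direct DFT, not an FFT).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


def ideal_binary_mask(target_tf, noise_tf, lc_db=0.0):
    """Return 1 where the local target-to-noise ratio exceeds lc_db, else 0.

    Assumption: target and noise are available separately (oracle setting).
    """
    eps = 1e-12  # avoids log of zero in silent T-F units
    mask = []
    for t_frame, n_frame in zip(target_tf, noise_tf):
        row = []
        for s, n in zip(t_frame, n_frame):
            snr_db = 10 * math.log10((abs(s) ** 2 + eps) / (abs(n) ** 2 + eps))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_tf, mask):
    """Weight each T-F unit of the mixture by the corresponding mask value."""
    return [[m * x for x, m in zip(frame, mrow)]
            for frame, mrow in zip(mixture_tf, mask)]


# Toy example: a "target" tone at bin 4 and a "noise" tone at bin 20.
N, FRAME = 256, 64
target = [math.sin(2 * math.pi * 4 * n / FRAME) for n in range(N)]
noise = [math.sin(2 * math.pi * 20 * n / FRAME) for n in range(N)]
mixture = [t + v for t, v in zip(target, noise)]

mask = ideal_binary_mask(stft(target), stft(noise))
separated = apply_mask(stft(mixture), mask)
```

In this toy case the mask keeps the T-F units around the target tone and zeroes those dominated by the noise tone. A practical system has to estimate the mask from the mixture alone, since the target and noise are not available separately.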
