The Application of Time-Frequency Masking To Improve Intelligibility of Dysarthric Speech in Background Noise.

Affiliations

Department of Communicative Disorders and Deaf Education, Utah State University, Logan.

Department of Speech and Hearing Science, The Ohio State University, Columbus.

Publication information

J Speech Lang Hear Res. 2023 May 9;66(5):1853-1866. doi: 10.1044/2023_JSLHR-22-00558. Epub 2023 Mar 21.

Abstract

PURPOSE

Background noise reduces speech intelligibility. Time-frequency (T-F) masking is an established signal processing technique that improves intelligibility of neurotypical speech in background noise. Here, we investigated a novel application of T-F masking, assessing its potential to improve intelligibility of neurologically degraded speech in background noise.

METHOD

Listener participants (n = 422) completed an intelligibility task either in the laboratory or online, listening to and transcribing audio recordings of neurotypical (control) and neurologically degraded (dysarthria) speech under three different processing types: speech in quiet (quiet), speech mixed with cafeteria noise (noise), and speech mixed with cafeteria noise and then subsequently processed by an ideal quantized mask (IQM) to remove the noise.
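
As a rough illustration of the masking idea (not the study's actual implementation), the sketch below assigns each time-frequency unit a gain chosen from a small set of steps according to that unit's local signal-to-noise ratio, then applies the gains to the mixture. It is "ideal" in the sense that it assumes the clean speech and noise energies are known separately. The function names, the SNR boundaries in `edges_db`, and the values in `gains` are all hypothetical; the abstract does not state the number of quantization steps or their attenuation values.

```python
import math


def ideal_quantized_mask(speech_energy, noise_energy, edges_db, gains):
    """Pick one gain per time-frequency unit from its local SNR.

    speech_energy, noise_energy: equally shaped nested lists
        (rows = frequency channels, columns = time frames).
    edges_db: ascending SNR boundaries in dB (assumed values).
    gains: one gain per quantization step, len(edges_db) + 1 entries
        (assumed values).
    """
    if len(gains) != len(edges_db) + 1:
        raise ValueError("need exactly one more gain than boundary")

    mask = []
    for speech_row, noise_row in zip(speech_energy, noise_energy):
        mask_row = []
        for s, n in zip(speech_row, noise_row):
            if s <= 0:
                snr_db = -math.inf   # no speech energy: lowest step
            elif n <= 0:
                snr_db = math.inf    # no noise energy: highest step
            else:
                snr_db = 10.0 * math.log10(s / n)
            # The step index is the number of boundaries the SNR reaches.
            step = sum(snr_db >= edge for edge in edges_db)
            mask_row.append(gains[step])
        mask.append(mask_row)
    return mask


def apply_mask(mixture, mask):
    """Scale each time-frequency unit of the mixture by its gain."""
    return [[m * g for m, g in zip(mix_row, gain_row)]
            for mix_row, gain_row in zip(mixture, mask)]


# Toy 2 x 2 example with made-up energies, boundaries, and gains.
speech = [[100.0, 1.0], [1.0, 0.0]]
noise = [[1.0, 1.0], [100.0, 1.0]]
edges_db = [-10.0, 0.0, 10.0]
gains = [0.0, 0.25, 0.5, 1.0]

mask = ideal_quantized_mask(speech, noise, edges_db, gains)
masked = apply_mask([[2.0, 2.0], [2.0, 2.0]], mask)
```

With two gains and a single boundary this reduces to a binary mask; adding steps lets units with intermediate SNR be attenuated rather than removed outright.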

RESULTS

We observed significant reductions in intelligibility of dysarthric speech, even at highly favorable signal-to-noise ratios (+11 to +23 dB) that did not impact neurotypical speech. We also observed significant intelligibility improvements from speech in noise to IQM-processed speech for both control and dysarthric speech across a wide range of noise levels. Furthermore, the overall benefit of IQM processing for dysarthric speech was comparable with that of the control speech in background noise, as was the intelligibility data collected in the laboratory versus online.

CONCLUSIONS

This study demonstrates proof of concept, validating the application of T-F masks to a neurologically degraded speech signal. Given that intelligibility challenges greatly impact communication, and thus the lives of people with dysarthria and their communication partners, the development of clinical tools to enhance intelligibility in this clinical population is critical.



