Department of Psychology, University of Cambridge, Cambridge, United Kingdom.
MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom.
J Acoust Soc Am. 2019 Mar;145(3):1493. doi: 10.1121/1.5094765.
The effects on speech intelligibility and sound quality of two noise-reduction algorithms were compared: a deep recurrent neural network (RNN) and spectral subtraction (SS). The RNN was trained using sentences spoken by a large number of talkers with a variety of accents, presented in babble. Different talkers were used for testing. Participants with mild-to-moderate hearing loss were tested. Stimuli were given frequency-dependent linear amplification to compensate for the individual hearing losses. A paired-comparison procedure was used to compare all possible combinations of three conditions. The conditions were: speech in babble with no processing (NP) or processed using the RNN or SS. In each trial, the same sentence was played twice using two different conditions. The participants indicated which one was better and by how much in terms of speech intelligibility and (in separate blocks) sound quality. Processing using the RNN was significantly preferred over NP and over SS processing for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. SS processing was not significantly preferred over NP for either subjective intelligibility or sound quality. Objective computational measures of speech intelligibility predicted better intelligibility for RNN than for SS or NP.
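The paired-comparison design described in the abstract (every pairing of the three conditions NP, RNN, and SS, with a signed "which is better and by how much" rating per trial) can be illustrated with a minimal sketch. The abstract does not give the rating scale or the scoring method, so the function names, the signed-score convention, and the pooling of trials presented in either order are assumptions made for illustration, not the authors' analysis.

```python
from collections import defaultdict
from itertools import combinations

# Condition labels taken from the abstract.
CONDITIONS = ("NP", "RNN", "SS")


def condition_pairs(conditions=CONDITIONS):
    """Return every unordered pair of conditions (3 pairs for 3 conditions)."""
    return list(combinations(conditions, 2))


def mean_preference(ratings):
    """Average signed preference ratings for each pair of conditions.

    `ratings` is an iterable of (first, second, score) tuples.
    Assumed (hypothetical) convention: a positive score means the
    first-played condition was preferred, a negative score means the
    second-played condition was preferred.
    """
    pooled = defaultdict(list)
    for first, second, score in ratings:
        # Put each pair in alphabetical order so that trials played as
        # (A, B) and (B, A) are pooled; flip the sign when swapping.
        if first > second:
            first, second, score = second, first, -score
        pooled[(first, second)].append(score)
    return {pair: sum(scores) / len(scores) for pair, scores in pooled.items()}
```

With this convention, a negative mean for the pair `("NP", "RNN")` would indicate an overall preference for RNN processing over no processing.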