Oxenham Andrew J, Simonson Andrea M, Turicchia Lorenzo, Sarpeshkar Rahul
Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
J Acoust Soc Am. 2007 Mar;121(3):1709-16. doi: 10.1121/1.2434757.
This study tested a time-domain spectral enhancement algorithm that was recently proposed by Turicchia and Sarpeshkar [IEEE Trans. Speech Audio Proc. 13, 243-253 (2005)]. The algorithm uses a filter bank, with each filter channel comprising broadly tuned amplitude compression, followed by more narrowly tuned expansion (companding). Normal-hearing listeners were tested in their ability to recognize sentences processed through a noise-excited envelope vocoder that simulates aspects of cochlear-implant processing. The sentences were presented in a steady background noise at signal-to-noise ratios of 0, 3, and 6 dB and were either passed directly through an envelope vocoder, or were first processed by the companding algorithm. Using an eight-channel envelope vocoder, companding produced small but significant improvements in speech reception. Parametric variations of the companding algorithm showed that the improvement in intelligibility was robust to changes in filter tuning, whereas decreases in the time constants resulted in a decrease in intelligibility. Companding continued to provide a benefit when the number of vocoder frequency channels was increased to sixteen. When integrated within a sixteen-channel cochlear-implant simulator, companding also led to significant improvements in sentence recognition. Thus, companding may represent a readily implementable way to provide some speech recognition benefits to current cochlear-implant users.
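The compress-then-expand structure of one filter channel can be illustrated with a minimal sketch. The abstract gives no equations, so everything specific here is an assumption rather than the study's implementation: the second-order band-pass filters, the rectify-and-smooth envelope detector, the power-law gains, and the hypothetical parameters `q_broad`, `q_narrow`, `n1`, `n2`, `tau`, and `floor`, whose values are placeholders.

```python
import math

def bandpass(x, fs, f0, q):
    """Second-order band-pass with unity gain at f0 (direct form I)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                      # b1 is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out

def envelope(x, fs, tau):
    """Rectify, then smooth with a one-pole low-pass of time constant tau (s)."""
    a = 1.0 - math.exp(-1.0 / (tau * fs))
    e, out = 0.0, []
    for v in x:
        e += a * (abs(v) - e)
        out.append(e)
    return out

def compand_channel(x, fs, f0, q_broad=2.0, q_narrow=8.0,
                    n1=0.3, n2=1.0, tau=0.005, floor=1e-6):
    """One channel: broadly tuned compression, then narrowly tuned expansion.

    All parameter values are illustrative assumptions, not taken from the study.
    """
    # Broad filter: the envelope reflects energy from neighbouring frequencies.
    wide = bandpass(x, fs, f0, q_broad)
    e1 = envelope(wide, fs, tau)
    # Compression: gain falls as the broadband envelope grows (n1 < 1).
    compressed = [v * max(e, floor) ** (n1 - 1.0) for v, e in zip(wide, e1)]
    # Narrow filter: keeps mainly the component near the channel centre.
    narrow = bandpass(compressed, fs, f0, q_narrow)
    e2 = envelope(narrow, fs, tau)
    # Expansion: gain rises with the narrowband envelope.
    return [v * max(e, floor) ** ((n2 - n1) / n1) for v, e in zip(narrow, e2)]

def compand(x, fs, centre_freqs, **kwargs):
    """Sum the companded channels of a filter bank."""
    channels = [compand_channel(x, fs, f0, **kwargs) for f0 in centre_freqs]
    return [sum(samples) for samples in zip(*channels)]
```

Under these assumptions, with `n2 = 1` an isolated steady tone at a channel's centre frequency should emerge at roughly its input level, because the compressive and expansive gains cancel. A weaker component that shares the broad filter with a stronger neighbour is compressed according to the neighbour's level but expanded according to its own, so it is attenuated, which sharpens spectral contrast. The `tau` parameter plays the role of the time constants whose reduction lowered intelligibility in the study.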