Buss Emily, Bosen Adam
Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA.
Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA.
JASA Express Lett. 2021 Aug;1(8):084402. doi: 10.1121/10.0005762. Epub 2021 Aug 2.
Predicting masked speech perception typically relies on estimates of the spectral distribution of cues supporting recognition. Current methods for estimating band importance for speech-in-noise use filtered stimuli. These methods are not appropriate for speech-in-speech because filtering can modify stimulus features affecting auditory stream segregation. Here, band importance is estimated by quantifying the relationship between speech recognition accuracy for full-spectrum speech and the target-to-masker ratio by channel at the output of an auditory filterbank. Preliminary results provide support for this approach and indicate that frequencies below 2 kHz may contribute more to speech recognition in two-talker speech than in speech-shaped noise.
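The analysis idea described above (relating trial-by-trial recognition accuracy to the per-channel target-to-masker ratio) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes rectangular FFT bands as a crude stand-in for an auditory filterbank, and a simple correlation between per-band TMR and trial correctness as a stand-in for whatever statistical model the paper uses. The function names `band_tmr_db` and `band_weights`, and the band edges passed in, are hypothetical.

```python
import numpy as np

def band_tmr_db(target, masker, fs, edges_hz):
    """Target-to-masker ratio (dB) per band.

    Assumption: bands are rectangular FFT bands between consecutive
    entries of edges_hz, not true auditory filters. target and masker
    must have the same length and sampling rate fs.
    """
    freqs = np.fft.rfftfreq(len(target), d=1.0 / fs)
    p_target = np.abs(np.fft.rfft(target)) ** 2
    p_masker = np.abs(np.fft.rfft(masker)) ** 2
    eps = 1e-12  # avoids log of zero in empty or silent bands
    tmr = []
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        sel = (freqs >= lo) & (freqs < hi)
        ratio = (p_target[sel].sum() + eps) / (p_masker[sel].sum() + eps)
        tmr.append(10.0 * np.log10(ratio))
    return np.array(tmr)

def band_weights(tmr_by_trial, correct):
    """Correlation between per-band TMR and trial correctness (0 or 1).

    Assumption: a plain Pearson correlation per band is used here as an
    illustrative importance index; the paper may use a different model.
    tmr_by_trial has shape (n_trials, n_bands); correct has shape (n_trials,).
    """
    x = tmr_by_trial - tmr_by_trial.mean(axis=0)
    y = correct - correct.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum()) + 1e-12
    return (x * y[:, None]).sum(axis=0) / denom
```

In this sketch, bands whose TMR covaries more strongly with correct responses receive larger weights. Because the stimuli are never filtered before presentation, the cues that support auditory stream segregation are left intact, which is the motivation stated in the abstract.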