Multitalker speech perception with ideal time-frequency segregation: effects of voice characteristics and number of talkers.

Authors

Brungart Douglas S, Chang Peter S, Simpson Brian D, Wang DeLiang

Affiliation

Air Force Research Laboratory, Human Effectiveness Directorate, Wright-Patterson AFB, Ohio 45433, USA.

Publication information

J Acoust Soc Am. 2009 Jun;125(6):4006-22. doi: 10.1121/1.3117686.

DOI:10.1121/1.3117686

PMID:19507982

Abstract

When a target voice is masked by an increasingly similar masker voice, increases in energetic masking are likely to occur due to increased spectro-temporal overlap in the competing speech waveforms. However, the impact of this increase may be obscured by informational masking effects related to the increased confusability of the target and masking utterances. In this study, the effects of target-masker similarity and the number of competing talkers on the energetic component of speech-on-speech masking were measured with an ideal time-frequency segregation (ITFS) technique that retained all the target-dominated time-frequency regions of a multitalker mixture but eliminated all the time-frequency regions dominated by the maskers. The results show that target-masker similarity has a small but systematic impact on energetic masking, with roughly a 1 dB release from masking for same-sex maskers versus same-talker maskers and roughly an additional 1 dB release from masking for different-sex masking voices. The results of a second experiment measuring ITFS performance with up to 18 interfering talkers indicate that energetic masking increased systematically with the number of competing talkers. These results suggest that energetic masking differences related to target-masker similarity have a much smaller impact on multitalker listening performance than energetic masking effects related to the number of competing talkers in the stimulus and non-energetic masking effects related to the confusability of the target and masking voices.
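The ITFS idea described in the abstract (keep the time-frequency regions dominated by the target, discard those dominated by the maskers) can be illustrated with a minimal sketch of an ideal binary mask. The abstract does not specify the time-frequency analysis or the decision threshold, so everything below is an assumption made for illustration: a plain STFT front end, a 0 dB local criterion, and the hypothetical names `power_spectrogram` and `ideal_binary_mask`. It is not the authors' implementation.

```python
import numpy as np

def power_spectrogram(x, frame=256, hop=128):
    """Power spectrogram (frames x bins) from a Hann-windowed STFT.
    Assumption: an STFT stands in for whatever analysis the study used."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ideal_binary_mask(target, masker, criterion_db=0.0, eps=1e-12):
    """True where the target dominates the masker in a time-frequency unit.
    Assumption: 'dominates' means local SNR above criterion_db (0 dB here)."""
    t_pow = power_spectrogram(target)
    m_pow = power_spectrogram(masker)
    local_snr_db = 10.0 * np.log10((t_pow + eps) / (m_pow + eps))
    return local_snr_db > criterion_db

# Toy signals (not speech): two tones at different frequencies.
fs = 8000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 500 * t)    # stands in for the target talker
masker = np.sin(2 * np.pi * 2000 * t)   # stands in for the sum of all maskers

mask = ideal_binary_mask(target, masker)
# Fraction of time-frequency units that would be retained for the listener.
retained_fraction = mask.mean()
```

With several competing talkers, `masker` would be the sum of all masker waveforms before the mask is computed. Applying the mask to the mixture's STFT and resynthesizing would give a stimulus of the kind the abstract describes; that step is omitted here.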
