GN ReSound A/S, 3215 Marine Street, Room W161, Boulder, Colorado 80309, USA.
J Acoust Soc Am. 2011 Jun;129(6):3981-90. doi: 10.1121/1.3583552.
The study of speech from which the temporal fine structure (TFS) has been removed has become an important research area. Common procedures for removing TFS include noise and tone vocoders. In the noise vocoder, bands of noise are modulated by the envelope of the speech within each band, and in the tone vocoder the carrier is a sinusoid at the center of each frequency band. Five different procedures for removing TFS are evaluated in this paper: the noise vocoder, a low-noise noise approach in which the noise envelope is replaced by the speech envelope in each frequency band, phase randomization within each band, the tone vocoder, and sinusoidal modeling with random phase. The effects of TFS modification on the speech envelope are evaluated using an index based on the envelope time-frequency modulation. The results show that for all of the TFS techniques implemented in this study, there is a substantial loss in the accuracy of reproduction of the envelope time-frequency modulation. The tone vocoder gives the best accuracy, followed by the procedure that replaces the noise envelope with the speech envelope in each band.
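As a rough illustration of the tone vocoder described above, the following sketch processes a single frequency band: it estimates the band envelope and imposes it on a sinusoid at the band center frequency, discarding the original fine structure. It is a minimal sketch under stated assumptions, not the procedure used in the paper. The band is assumed to be already band-pass filtered, the envelope detector (full-wave rectification followed by a one-pole low-pass at an assumed 50 Hz cutoff) is a stand-in for whatever detector a given study uses, and the function names `envelope` and `tone_vocode_band` are hypothetical.

```python
import math

def envelope(band, fs, cutoff_hz=50.0):
    """Estimate the envelope of one band by rectify-and-smooth.

    Assumed detector: full-wave rectification followed by a one-pole
    low-pass filter. The result tracks the true envelope only up to a
    constant scale factor (about 2/pi for a sinusoidal carrier).
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for x in band:
        state += alpha * (abs(x) - state)  # one-pole smoothing of |x|
        env.append(state)
    return env

def tone_vocode_band(band, fs, center_hz):
    """Replace the band's fine structure with a sinusoid at center_hz,
    modulated by the band's estimated envelope."""
    env = envelope(band, fs)
    return [e * math.sin(2.0 * math.pi * center_hz * n / fs)
            for n, e in enumerate(env)]

# Illustrative input (an assumption, not data from the paper):
# a 1 kHz carrier with arbitrary phase and a 4 Hz amplitude modulation.
fs = 8000
band = [(1.0 + 0.5 * math.sin(2.0 * math.pi * 4.0 * n / fs))
        * math.sin(2.0 * math.pi * 1000.0 * n / fs + 1.0)
        for n in range(fs)]
vocoded = tone_vocode_band(band, fs, center_hz=1000.0)
```

A noise vocoder would follow the same outline with band-limited noise in place of the sinusoidal carrier. A complete vocoder would repeat this step for every band of a filter bank and sum the results, which this sketch omits.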