Department of Speech-Language Pathology and Audiology, Towson University, MD.
Program of Speech and Hearing Science, College of Health Solutions, Arizona State University, Tempe.
J Speech Lang Hear Res. 2022 Aug 17;65(8):3146-3164. doi: 10.1044/2022_JSLHR-21-00576. Epub 2022 Aug 9.
The objective of this study was to determine if and how the subcortical neural representation of pitch cues in listeners with normal hearing is affected by systematic manipulation of vocoder parameters.
This study assessed the effects of temporal envelope cutoff frequency (50 and 500 Hz), number of channels (1-32), and carrier type (sine-wave and noise-band) on the brainstem neural representation of fundamental frequency (F0), measured as frequency-following responses (FFRs) to vocoded vowels in 15 young adult listeners with normal hearing.
Results showed that FFR strength (quantified as absolute magnitude divided by noise floor [NF] magnitude) significantly improved with 500-Hz vs. 50-Hz temporal envelopes for all channel numbers and both carriers except the 1-channel noise-band vocoder. FFR strength with 500-Hz temporal envelopes significantly improved when the channel number increased from 1 to 2, but it either declined (sine-wave vocoders) or saturated (noise-band vocoders) when the channel number increased from 4 to 32. FFR strength with 50-Hz temporal envelopes was similarly small for both carriers with all channel numbers, except for a significant improvement with the 16-channel sine-wave vocoder. With 500-Hz temporal envelopes, FFR strength was significantly greater for sine-wave vocoders than for noise-band vocoders with channel numbers 1-8; no significant differences were seen with 16 and 32 channels. With 50-Hz temporal envelopes, the carrier effect was only observed with 16 channels. In contrast, there was no significant carrier effect for the absolute magnitude. Compared to sine-wave vocoders, noise-band vocoders had a higher NF and thus lower relative FFR strength.
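The normalization described above (absolute magnitude divided by NF magnitude) can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the magnitude at F0 is taken from a single discrete Fourier component, and the NF is estimated as the mean magnitude at a few neighboring frequencies. The function names, the `noise_offsets` parameter, and the synthetic signal are hypothetical and are not the authors' analysis pipeline.

```python
import math
import random


def dft_magnitude(signal, fs, freq):
    """Amplitude of the single Fourier component of `signal` at `freq` Hz."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def ffr_strength(response, fs, f0, noise_offsets=(-20.0, -10.0, 10.0, 20.0)):
    """Return (absolute magnitude at F0, noise floor, their ratio).

    Assumption: the noise floor is the mean magnitude at frequencies
    offset from F0 by `noise_offsets` Hz. The study's actual NF
    estimate may differ.
    """
    absolute = dft_magnitude(response, fs, f0)
    noise_floor = sum(
        dft_magnitude(response, fs, f0 + d) for d in noise_offsets
    ) / len(noise_offsets)
    return absolute, noise_floor, absolute / noise_floor


# Synthetic stand-in for an averaged response: a 100-Hz component plus noise.
fs = 1000    # sampling rate in Hz (hypothetical)
f0 = 100.0   # fundamental frequency in Hz (hypothetical)
rng = random.Random(0)
response = [
    math.sin(2 * math.pi * f0 * i / fs) + rng.gauss(0.0, 0.5)
    for i in range(fs)  # 1 second of data
]

absolute, noise_floor, strength = ffr_strength(response, fs, f0)
print(f"absolute={absolute:.3f}  NF={noise_floor:.3f}  strength={strength:.1f}")
```

Under this kind of metric, two conditions with the same absolute magnitude at F0 can differ in relative strength purely because one has a higher NF, which is the pattern reported here for noise-band versus sine-wave vocoders.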
It is important to normalize the magnitude relative to the NF when analyzing FFRs to vocoded speech. The physiological findings reported here may result from the availability of F0-related temporal periodicity and spectral sidelobes in vocoded signals and should be considered when selecting vocoder parameters and interpreting results in future physiological studies. In general, the dependence of brainstem neural phase-locking strength to F0 on vocoder parameters may confound the comparison of pitch-related behavioral results across different vocoder designs.