Department of Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA.
Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA.
J Acoust Soc Am. 2023 Aug 1;154(2):602-618. doi: 10.1121/10.0020536.
Fricatives are obstruent sound contrasts made by airflow constrictions in the vocal tract that produce turbulence across the constriction or at a site downstream from the constriction. Fricatives exhibit significant intra/intersubject and contextual variability. Yet, fricatives are perceived with high accuracy. The current study investigated modeled neural responses to fricatives in the auditory nerve (AN) and inferior colliculus (IC) with the hypothesis that response profiles across populations of neurons provide robust correlates to consonant perception. Stimuli were 270 intervocalic fricatives (10 speakers × 9 fricatives × 3 utterances). Computational model response profiles had characteristic frequencies that were log-spaced from 125 Hz to 8 or 20 kHz to explore the impact of high-frequency responses. Confusion matrices generated by k-nearest-neighbor subspace classifiers were based on the profiles of average rates across characteristic frequencies as feature vectors. Model confusion matrices were compared with published behavioral data. The modeled AN and IC neural responses provided better predictions of behavioral accuracy than the stimulus spectra, and IC showed better accuracy than AN. Behavioral fricative accuracy was explained by modeled neural response profiles, whereas confusions were only partially explained. Extended frequencies improved accuracy based on the model IC, corroborating the importance of extended high frequencies in speech perception.
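The classification pipeline described in the abstract (rate profiles across log-spaced characteristic frequencies, used as feature vectors for k-nearest-neighbor subspace classifiers that yield confusion matrices) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the rate profiles are synthetic Gaussian bumps rather than AN or IC model outputs, the number of characteristic frequencies (`n_cf=30`) is hypothetical, "subspace" is read here as a random-subspace ensemble of k-NN learners, and all function names are invented.

```python
import numpy as np


def log_spaced_cfs(f_lo, f_hi, n_cf):
    """Return n_cf characteristic frequencies (Hz), log-spaced from f_lo to f_hi."""
    return np.geomspace(f_lo, f_hi, n_cf)


def knn_predict(train_x, train_y, test_x, k, n_classes):
    """Plain k-NN majority vote using Euclidean distance."""
    # Pairwise distances, shape (n_test, n_train).
    d = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    votes = train_y[np.argsort(d, axis=1)[:, :k]]
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)  # ties go to the lowest class index


def subspace_knn_predict(train_x, train_y, test_x, n_classes,
                         k=1, n_learners=30, subspace_dim=None, seed=0):
    """Assumed reading of 'subspace k-NN': each learner sees a random subset
    of features (CF channels); the ensemble takes a majority vote."""
    rng = np.random.default_rng(seed)
    n_feat = train_x.shape[1]
    dim = subspace_dim or max(1, n_feat // 2)
    tally = np.zeros((test_x.shape[0], n_classes), dtype=int)
    for _ in range(n_learners):
        cols = rng.choice(n_feat, size=dim, replace=False)
        pred = knn_predict(train_x[:, cols], train_y, test_x[:, cols], k, n_classes)
        tally[np.arange(test_x.shape[0]), pred] += 1
    return tally.argmax(axis=1)


def confusion_matrix(true_y, pred_y, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (true_y, pred_y), 1)
    return cm


def demo(n_cf=30, f_hi=8000.0, seed=0):
    """Synthetic stand-in for 9 fricatives x 30 tokens (10 speakers x 3 utterances)."""
    rng = np.random.default_rng(seed)
    n_classes, n_tokens = 9, 30
    cfs = log_spaced_cfs(125.0, f_hi, n_cf)
    # Position of each CF on a normalized log-frequency axis (0 to 1).
    pos = np.log(cfs / cfs[0]) / np.log(cfs[-1] / cfs[0])
    x, y = [], []
    for c, mu in enumerate(np.linspace(0.1, 0.9, n_classes)):
        profile = np.exp(-0.5 * ((pos - mu) / 0.1) ** 2)  # fake rate profile
        x.append(profile + 0.05 * rng.standard_normal((n_tokens, n_cf)))
        y.append(np.full(n_tokens, c))
    x, y = np.vstack(x), np.concatenate(y)
    train = np.arange(len(y)) % 2 == 0  # simple even/odd split
    pred = subspace_knn_predict(x[train], y[train], x[~train], n_classes)
    return confusion_matrix(y[~train], pred, n_classes)


if __name__ == "__main__":
    cm = demo()
    print(cm)
    print("accuracy:", np.trace(cm) / cm.sum())
```

Passing `f_hi=20000.0` to `demo` switches the sketch to the extended-frequency range mentioned in the abstract. Because the data are synthetic, the resulting accuracy says nothing about the study's findings; the sketch only shows how a confusion matrix can be derived from profiles of average rates across characteristic frequencies.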