Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden.
Eriksholm Research Centre, Snekkersten, Denmark.
PLoS One. 2024 Feb 8;19(2):e0297826. doi: 10.1371/journal.pone.0297826. eCollection 2024.
Perception of sounds and speech involves structures in the auditory brainstem that rapidly process ongoing auditory stimuli. The role of these structures in speech processing can be investigated by measuring their electrical activity using scalp-mounted electrodes. However, typical analysis methods involve averaging neural responses to many short repetitive stimuli that bear little relevance to daily listening environments. Recently, subcortical responses to more ecologically relevant continuous speech were detected using linear encoding models. These methods estimate the temporal response function (TRF), which is a regression model that minimises the error between the measured neural signal and a predictor derived from the stimulus. Using predictors that model the highly non-linear peripheral auditory system may improve linear TRF estimation accuracy and peak detection. Here, we compare predictors from both simple and complex peripheral auditory models for estimating brainstem TRFs on electroencephalography (EEG) data from 24 participants listening to continuous speech. We also investigate the data length required for estimating subcortical TRFs, and find that around 12 minutes of data is sufficient for clear wave V peaks (>3 dB SNR) to be seen in nearly all participants. Interestingly, predictors derived from simple filterbank-based models of the peripheral auditory system yield TRF wave V peak SNRs that are not significantly different from those estimated using a complex model of the auditory nerve, provided that the nonlinear effects of adaptation in the auditory system are appropriately modelled. Crucially, computing predictors from these simpler models is more than 50 times faster than computing them from the complex model. This work paves the way for efficient modelling and detection of subcortical processing of continuous speech, which may lead to improved diagnostic metrics for hearing impairment and assistive hearing technology.
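The description of the TRF as a regression model that minimises the error between the measured neural signal and a stimulus-derived predictor can be illustrated with a minimal sketch. This is not the estimation procedure used in the study, which the abstract does not specify; it is a generic time-domain ridge regression on simulated data. The sampling rate, lag range, ridge parameter, kernel shape, and the function names `lagged_matrix` and `estimate_trf` are all assumptions chosen for illustration.

```python
import numpy as np


def lagged_matrix(predictor, lags):
    """Stack time-shifted copies of the predictor, one column per lag (in samples)."""
    n = len(predictor)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            # Positive lag: the response at time t depends on the predictor at t - lag.
            X[lag:, j] = predictor[:n - lag]
        else:
            # Negative lag: the response at time t is paired with the predictor at t + |lag|.
            X[:n + lag, j] = predictor[-lag:]
    return X


def estimate_trf(predictor, eeg, lags, ridge=1.0):
    """Ridge-regularised least-squares estimate of the TRF weights (one per lag)."""
    X = lagged_matrix(predictor, lags)
    XtX = X.T @ X
    Xty = X.T @ eeg
    return np.linalg.solve(XtX + ridge * np.eye(len(lags)), Xty)


# Simulated example (all values hypothetical, not taken from the study).
rng = np.random.default_rng(0)
n_samples = 20000
lags = np.arange(-10, 60)                      # lags in samples
peak_lag = 28                                  # about 7 ms at an assumed 4096 Hz
true_trf = np.exp(-0.5 * ((lags - peak_lag) / 2.0) ** 2)

predictor = rng.standard_normal(n_samples)     # stand-in for a stimulus-derived predictor
eeg = lagged_matrix(predictor, lags) @ true_trf + rng.standard_normal(n_samples)

trf = estimate_trf(predictor, eeg, lags, ridge=1.0)
print("estimated peak lag (samples):", lags[int(np.argmax(trf))])
```

In this sketch the predictor is white noise, so the lagged columns are nearly uncorrelated and a small ridge term is enough. A real speech-derived predictor is strongly autocorrelated, which is one reason regularisation or frequency-domain approaches are often preferred in practice.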