Department of Communication Sciences & Disorders, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA.
Department of Electrical and Computer Engineering, Cockrell School of Engineering, The University of Texas at Austin, Austin, TX, USA.
Brain Behav. 2017 Apr 26;7(6):e00665. doi: 10.1002/brb3.665. eCollection 2017 Jun.
Scalp-recorded electrophysiological responses to complex, periodic auditory signals reflect phase-locked activity from neural ensembles within the auditory system. These responses, referred to as frequency-following responses (FFRs), have been widely utilized to index typical and atypical representation of speech signals in the auditory system. One of the major limitations of the FFR is the low signal-to-noise ratio at the level of single trials. For this reason, analysis relies on averaging across thousands of trials. The ability to examine the quality of single-trial FFRs would allow investigation of trial-by-trial dynamics of the FFR, which has not been possible with the averaging approach.
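A minimal sketch (not the authors' code) of why conventional FFR analysis averages thousands of trials: a weak phase-locked component at an assumed 100 Hz fundamental is simulated in additive noise, and a crude spectral SNR is compared for a single trial versus the across-trial average. All parameters (sampling rate, epoch length, trial count, noise level) are illustrative assumptions.

```python
import numpy as np

fs = 16000                       # assumed sampling rate in Hz
t = np.arange(0, 0.25, 1 / fs)   # 250 ms epoch (assumed)
f0 = 100.0                       # simulated fundamental frequency
n_trials = 2000

rng = np.random.default_rng(0)
signal = 0.05 * np.sin(2 * np.pi * f0 * t)                 # weak phase-locked response
trials = signal + rng.normal(0, 1.0, (n_trials, t.size))   # independent noise per trial

def snr_at_f0(x):
    """Spectral amplitude at f0 relative to the mean amplitude of nearby noise bins."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, 1 / fs)
    peak = spec[np.argmin(np.abs(freqs - f0))]
    noise = spec[(freqs > f0 + 20) & (freqs < f0 + 120)].mean()
    return peak / noise

print("single-trial SNR:", snr_at_f0(trials[0]))              # low: response buried in noise
print("2000-trial average SNR:", snr_at_f0(trials.mean(0)))   # much larger after averaging
```

Averaging attenuates trial-independent noise by roughly the square root of the number of trials, which is why the single-trial SNR is so much poorer than the averaged one.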
In a novel, data-driven approach, we used machine learning principles to decode information related to the speech signal from single trial FFRs. FFRs were collected from participants while they listened to two vowels produced by two speakers. Scalp-recorded electrophysiological responses were projected onto a low-dimensional spectral feature space independently derived from the same two vowels produced by 40 speakers, which were not presented to the participants. A novel supervised machine learning classifier was trained to discriminate vowel tokens on a subset of FFRs from each participant, and tested on the remaining subset.
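The following sketch outlines the decoding pipeline described above, with stand-in choices where details are not given here: PCA is assumed as the method for deriving the low-dimensional spectral feature space from the independent 40-speaker vowel set, a linear SVM stands in for the paper's classifier, and the data arrays are hypothetical placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def spectra(epochs):
    """Magnitude spectra of epochs (n_epochs x n_samples)."""
    return np.abs(np.fft.rfft(epochs, axis=1))

# Hypothetical arrays: the two vowels produced by 40 speakers (not presented to
# participants), and single-trial FFR epochs with vowel labels for one participant.
# Epoch lengths are assumed to match so spectra share the same bins.
vowel_audio = np.random.randn(80, 4000)     # placeholder for real vowel recordings
ffr_trials = np.random.randn(2000, 4000)    # placeholder for real single-trial FFRs
ffr_labels = np.random.randint(0, 2, 2000)  # vowel identity per trial

# 1) Derive the low-dimensional spectral feature space from the independent vowel set.
pca = PCA(n_components=10).fit(spectra(vowel_audio))

# 2) Project single-trial FFR spectra onto that feature space.
features = pca.transform(spectra(ffr_trials))

# 3) Train on a subset of trials and test on the held-out remainder.
X_tr, X_te, y_tr, y_te = train_test_split(features, ffr_labels,
                                          test_size=0.3, random_state=0)
clf = LinearSVC().fit(X_tr, y_tr)
print("held-out vowel decoding accuracy:", clf.score(X_te, y_te))
```

With placeholder noise the accuracy hovers around chance; the point of the sketch is only the order of operations (independent feature derivation, projection, per-participant train/test split), not the specific models used in the study.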
We demonstrate reliable decoding of speech signals at the level of single trials by decomposing the raw FFR based on information-bearing spectral features in the speech signal that were independently derived.
Taken together, the ability to extract interpretable features at the level of single trials in a data-driven manner offers uncharted possibilities in the noninvasive assessment of human auditory function.