Desai Maansi, Field Alyssa M, Hamilton Liberty S
Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States.
Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, United States.
Front Hum Neurosci. 2023 Jan 20;16:1001171. doi: 10.3389/fnhum.2022.1001171. eCollection 2022.
Experiments that use electroencephalography (EEG) to investigate auditory and speech processing in the brain are often lengthy and tedious. Typically, the experimenter errs on the side of including more data and more trials, and therefore runs a longer task, to ensure that the data are robust and effects are measurable. Recent studies have used naturalistic stimuli to investigate the brain's response to individual speech features, or to combinations of features, using system identification techniques such as multivariate temporal receptive field (mTRF) analyses. The neural data collected in such experiments must be divided into a training set and a test set to fit and validate the mTRF weights. While collecting as much data as is feasible is clearly a good strategy, it is unclear how much data are needed to achieve stable results. It is also unclear whether the specific stimulus used for mTRF fitting and the choice of feature representation affect how much data are required for robust and generalizable results. Here, we used previously collected EEG data from our lab with sentence stimuli and movie stimuli, as well as EEG data from an open-source dataset with audiobook stimuli, to better understand how much data need to be collected for naturalistic speech experiments measuring acoustic and phonetic tuning. We found that the EEG receptive field structure tested here stabilizes with approximately 200 s of training data for TIMIT sentences, approximately 600 s for movie trailers, and approximately 460 s for audiobooks. We therefore provide suggestions on the minimum amount of data necessary for fitting mTRFs from naturalistic listening data. Our findings are motivated by highly practical concerns when working with children, patient populations, or others who may not tolerate long study sessions. These findings will aid future researchers who wish to study naturalistic speech processing in healthy and clinical populations while minimizing participant fatigue and retaining signal quality.
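The question of how much training data an mTRF needs can be illustrated with a minimal sketch. It assumes a common formulation of the mTRF as ridge regression on time-lagged stimulus features, and it runs on synthetic data only. It is not the authors' pipeline: the function names (`lag_matrix`, `fit_trf`, `learning_curve`), the sampling rate, the number of lags, the ridge parameter, the noise level, and the durations are all hypothetical choices made for illustration.

```python
import numpy as np

def lag_matrix(stim, n_lags):
    """Stack delayed copies of stim (n_times, n_feats) into (n_times, n_feats * n_lags)."""
    n_times, n_feats = stim.shape
    X = np.zeros((n_times, n_feats * n_lags))
    for lag in range(n_lags):
        # Column block `lag` holds the stimulus delayed by `lag` samples (zero padded).
        X[lag:, lag * n_feats:(lag + 1) * n_feats] = stim[:n_times - lag]
    return X

def fit_trf(X, y, alpha):
    """Ridge regression weights: w = (X'X + alpha * I)^-1 X'y."""
    XtX = X.T @ X
    return np.linalg.solve(XtX + alpha * np.eye(XtX.shape[0]), X.T @ y)

def learning_curve(stim, eeg, fs, train_durations_s, test_s, n_lags, alpha=1.0):
    """Held-out prediction correlation for each training duration (in seconds)."""
    X = lag_matrix(stim, n_lags)
    n_test = int(test_s * fs)
    # The test set is the final segment; training data are taken from the start.
    X_test, y_test = X[-n_test:], eeg[-n_test:]
    scores = []
    for dur in train_durations_s:
        n_train = int(dur * fs)
        assert n_train + n_test <= len(eeg), "training and test segments overlap"
        w = fit_trf(X[:n_train], eeg[:n_train], alpha)
        scores.append(np.corrcoef(X_test @ w, y_test)[0, 1])
    return scores

# Synthetic single-channel example (all values are assumptions, not from the study).
rng = np.random.default_rng(0)
fs, n_lags = 64, 20                      # 64 Hz sampling, 20 lags (about 0.3 s)
stim = rng.standard_normal((300 * fs, 1))  # 300 s of a one-dimensional feature
true_trf = np.hanning(n_lags)            # made-up "ground truth" response
eeg = lag_matrix(stim, n_lags) @ true_trf + 10.0 * rng.standard_normal(300 * fs)

durations = [2, 10, 50, 150]
scores = learning_curve(stim, eeg, fs, durations, test_s=60, n_lags=n_lags)
for dur, r in zip(durations, scores):
    print(f"{dur:4d} s of training data: test r = {r:.3f}")
```

In this toy setting the test-set correlation rises with training duration and then flattens, which is the kind of plateau one would look for in real data. The duration at which it flattens here says nothing about real EEG, where it depends on the stimulus, the feature set, and the noise level.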