Otolaryngology Head and Neck Surgery, Ear & Hearing, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands.
Eriksholm Research Centre, Snekkersten, Denmark.
Trends Hear. 2024 Jan-Dec;28:23312165241232551. doi: 10.1177/23312165241232551.
In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults with hearing loss (mean age = 64.6 years, SD = 9.2). Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period, and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context, and sentence accuracy. K-fold cross-validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) outperformed the group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.
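The abstract does not specify the implementation details. As a minimal sketch, assuming scikit-learn, standardized features, and k = 5 (all assumptions, as are the simulated data and the trial count per participant), the three evaluation schemes described above could be set up as follows:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score, LeaveOneGroupOut, StratifiedKFold

# Seven per-trial features named in the abstract (column order is illustrative):
# baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat
# interval, blood volume pulse amplitude, pre-ejection period, pulse arrival time.
rng = np.random.default_rng(0)
n_participants, trials_per_participant = 29, 40   # trial count is an assumption
X = rng.normal(size=(n_participants * trials_per_participant, 7))
y = rng.integers(0, 2, size=len(X))               # e.g. task demand: 50% vs. 80% SNR
groups = np.repeat(np.arange(n_participants), trials_per_participant)

# kNN with feature standardization (the scaling step and k = 5 are assumptions;
# the abstract only states that k-nearest neighbor classifiers were used).
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# 1) Group-level k-fold CV: trials from the same participant may appear in
#    both training and test folds (the abstract's group-level analysis).
group_level = cross_val_score(
    knn, X, y, cv=StratifiedKFold(5, shuffle=True, random_state=0)
)

# 2) Cross-participant CV: train on some participants, test on a held-out
#    participant, so accuracy reflects generalization to novel participants.
cross_part = cross_val_score(knn, X, y, groups=groups, cv=LeaveOneGroupOut())

# 3) Individually calibrated classifiers: one model per participant,
#    cross-validated within that participant's own trials.
per_part = [
    cross_val_score(knn, X[groups == p], y[groups == p], cv=5).mean()
    for p in range(n_participants)
]

print(f"group-level CV accuracy:    {group_level.mean():.3f}")
print(f"cross-participant accuracy: {cross_part.mean():.3f}")
print(f"individual mean accuracy:   {np.mean(per_part):.3f}")
```

The key design point the abstract turns on is scheme (2): leave-one-participant-out splitting keeps each participant's trials entirely out of training, which is what exposes the drop in accuracy for novel participants that pooled k-fold cross-validation (scheme 1) masks.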