Department of Otolaryngology, University Hospital Regensburg, Regensburg, Germany.
Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
Ear Hear. 2021 Sep/Oct;42(5):1208-1217. doi: 10.1097/AUD.0000000000001003.
In contrast to the moderate presentation levels most commonly used in clinical practice, speech encountered in everyday life often occurs at low levels, such as when a conversational partner whispers or speaks from another room. In addition, even when the overall signal level is moderate, levels for particular words or speech sounds, such as voiceless consonants, can be considerably lower. Existing techniques for improving recognition of low-level speech for cochlear implant users include using a wider input dynamic range and elevating electrode threshold levels (T-levels). While these techniques tend to improve recognition of soft speech, each has also been associated with drawbacks. Recently, a noise-gating (NG) algorithm was reported, which works by eliminating input to an electrode when the signal level in the associated frequency channel is at or below a predetermined threshold. Available evidence suggests that activation of this algorithm can improve recognition of sentences presented at low levels (35 to 50 dB SPL), though it remains unclear whether the benefits will be equally evident with both manufacturer default and individually optimized T-levels. The primary aim of this study was therefore to evaluate the independent and combined effects of NG activation and T-level personalization.
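The gating rule described above can be illustrated with a minimal sketch. It is not the manufacturer's implementation: the function name, the use of dB values per channel, and the example threshold are all assumptions made for illustration, since the abstract does not state how the threshold is set.

```python
from typing import List, Optional


def apply_noise_gate(
    channel_levels_db: List[float],
    gate_threshold_db: float,
) -> List[Optional[float]]:
    """Return per-channel input levels after gating.

    A channel whose level is at or below the threshold is gated, which is
    represented here as None (no input delivered to that electrode).
    The threshold value is hypothetical; the source does not specify it.
    """
    return [
        level if level > gate_threshold_db else None
        for level in channel_levels_db
    ]


# Illustrative values only: three channels and an assumed 25 dB threshold.
gated = apply_noise_gate([20.0, 25.0, 40.0], gate_threshold_db=25.0)
```

In this sketch the first two channels are gated because they are at or below the assumed threshold, and only the third channel passes its input through.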
Twenty adults between the ages of 25 and 77 years (M = 54.9 years, SD = 17.56) with postlingually acquired profound hearing loss completed testing for this study. Participants were fitted with an Advanced Bionics Naida CI Q90 speech processor, which contained four programs based on each participant's existing everyday program. The programs varied by the NG algorithm setting (on, off) and T-level method (default 10% of M-level, personalized based on subjective ratings of "very quiet"). All participants completed speech sound detection threshold testing (/m/, /u/, /a/, /i/, /s/, and /ʃ/), as well as tests of monosyllabic word recognition in quiet (45 and 60 dB SPL), sentence recognition in quiet (45 and 60 dB SPL), and sentence recognition in noise (45-dB SPL speech, +10 dB SNR).
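The four programs form a 2 x 2 design (NG on or off, crossed with default or personalized T-levels). The sketch below enumerates those conditions and computes the default T-level as 10% of M-level, as stated above. The class and function names are hypothetical, and the M-level is treated as a plain number in unspecified device units, which is an assumption.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class Program:
    """One of the four study programs (names are illustrative)."""
    noise_gate_on: bool
    t_level_method: str  # "default" or "personalized"


def default_t_level(m_level: float) -> float:
    """Default T-level: 10% of the electrode's M-level (per the abstract)."""
    return 0.10 * m_level


def build_programs() -> List[Program]:
    """Cross the two NG settings with the two T-level methods."""
    return [
        Program(noise_gate_on=ng, t_level_method=method)
        for ng, method in product((False, True), ("default", "personalized"))
    ]


programs = build_programs()  # four conditions in total
```

Personalized T-levels are not computed here because they come from each participant's subjective ratings of "very quiet" rather than from a formula.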
Findings demonstrated that both activating NG and personalizing T-levels in isolation significantly improved detection (speech sounds) and recognition (monosyllables, sentences in quiet, and sentences in noise) of soft speech, with their respective individual effects being comparable. However, the lowest speech sound detection thresholds and the highest speech recognition performance were identified when NG was activated in conjunction with personalized T-levels. Importantly, neither T-level personalization nor NG activation affected recognition of speech presented at 60 dB SPL, which suggests the strategies should not be expected to interfere with recognition of average conversational speech.
Taken together, these data support the clinical recommendation of personalizing T-levels and activating NG to improve the detection and recognition of soft speech. However, future work is needed to evaluate potential limitations of these techniques. Specifically, speech recognition testing should be performed in the presence of diverse noise backgrounds, and home trials should be conducted to determine processing effects on sound quality in realistic environments.