Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yishan Road, Xuhui District, Shanghai, 200233, China.
Department of Otorhinolaryngology Head and Neck Surgery, Shanghai Key Clinical Disciplines of Otorhinolaryngology, Eye and ENT Hospital of Fudan University, 83 Fenyang Road, Xuhui District, Shanghai, 200031, China.
BMC Neurosci. 2022 Jun 13;23(1):35. doi: 10.1186/s12868-022-00721-z.
Cochlear implants (CIs) restore hearing in patients with hearing loss by conveying temporal envelope cues. Although CIs enable users to communicate in clear listening environments, noisy environments still pose a problem. To improve the speech-processing strategies used in Chinese CIs, we explored the relative contributions of the temporal envelope in various frequency regions to Mandarin sentence recognition in noise.
Original speech material from the Mandarin version of the Hearing in Noise Test (MHINT) was mixed with one of three maskers at a +5 dB signal-to-noise ratio: speech-shaped noise (SSN), sinusoidally amplitude-modulated speech-shaped noise (SAM SSN), or sinusoidally amplitude-modulated (SAM) white noise (4 Hz). Envelope information of the noise-corrupted speech material was extracted from 30 contiguous bands that were allocated to five frequency regions. The intelligibility of the noise-corrupted speech material, with temporal cues from one or two regions removed, was measured to estimate the relative weights of temporal envelope cues from the five frequency regions.
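The mixing step can be illustrated with a minimal sketch, which is not the authors' implementation. It modulates white noise sinusoidally at 4 Hz and scales it so that the speech-to-noise power ratio is +5 dB. The sampling rate, the modulation depth of 1.0, the random placeholder "speech" signal, and the function names `sam_modulate` and `mix_at_snr` are all assumptions made for illustration; the abstract specifies none of them.

```python
import numpy as np

def sam_modulate(noise, fs, rate_hz=4.0, depth=1.0):
    """Apply sinusoidal amplitude modulation to a noise carrier.

    depth=1.0 (full modulation) is an assumption; the abstract gives only the rate.
    """
    t = np.arange(len(noise)) / fs
    return noise * (1.0 + depth * np.sin(2.0 * np.pi * rate_hz * t))

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that mean speech power / mean noise power equals snr_db."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise, gain

rng = np.random.default_rng(0)
fs = 16000                               # assumed sampling rate in Hz
speech = rng.standard_normal(fs)         # placeholder standing in for an MHINT sentence
noise = sam_modulate(rng.standard_normal(fs), fs, rate_hz=4.0)

mixed, gain = mix_at_snr(speech, noise, snr_db=5.0)

# Verify the achieved SNR from the scaled noise component.
achieved_snr_db = 10.0 * np.log10(np.mean(speech ** 2) / np.mean((gain * noise) ** 2))
```

Here the SNR is defined over long-term average power, which is one common convention; the study may have used a different level definition.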
In SSN, the mean weights of Regions 1-5 were 0.34, 0.19, 0.20, 0.16, and 0.11, respectively; in SAM SSN, the mean weights of Regions 1-5 were 0.34, 0.17, 0.24, 0.14, and 0.11, respectively; and in SAM white noise, the mean weights of Regions 1-5 were 0.46, 0.24, 0.22, 0.06, and 0.02, respectively.
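As a quick arithmetic check on the reported values, the short sketch below tabulates the mean weights and confirms that they sum to 1 within each noise condition and that Region 1 carries the largest weight in every condition. The dictionary layout and variable names are illustrative only; the numbers are copied from the results above.

```python
# Mean relative weights of Regions 1-5, as reported in the abstract.
weights = {
    "SSN":             [0.34, 0.19, 0.20, 0.16, 0.11],
    "SAM SSN":         [0.34, 0.17, 0.24, 0.14, 0.11],
    "SAM white noise": [0.46, 0.24, 0.22, 0.06, 0.02],
}

# Sum of weights per condition (rounded to absorb floating-point error).
totals = {cond: round(sum(w), 2) for cond, w in weights.items()}

# 1-based index of the region with the largest weight in each condition.
top_region = {cond: w.index(max(w)) + 1 for cond, w in weights.items()}

for cond in weights:
    print(f"{cond}: total = {totals[cond]:.2f}, largest weight in Region {top_region[cond]}")
```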
The results suggest that, for all three types of noise, the temporal envelope in the low-frequency region transmits the greatest amount of information for Mandarin sentence recognition, which differs from the perception strategy employed in clear listening environments.