Zhang Yue, Johannesen Peter T, Molaee-Ardekani Behnam, Wijetillake Aswin, Attili Chiea Rafael, Hasan Pierre-Yves, Segovia-Martínez Manuel, Lopez-Poveda Enrique A
Department of Research and Technology, Oticon Medical, Vallauris, France.
Laboratorio de Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain.
Ear Hear. 2025;46(1):163-183. doi: 10.1097/AUD.0000000000001565. Epub 2024 Sep 6.
We compared sound quality and performance for a conventional cochlear-implant (CI) audio processing strategy based on the short-time fast Fourier transform (Crystalis) and an experimental strategy based on spectral feature extraction (SFE). In the latter, the more salient spectral features (acoustic events) were extracted and mapped onto the CI stimulation electrodes. We hypothesized that (1) SFE would be superior to Crystalis because it can encode acoustic spectral features without the constraints imposed by the short-time fast Fourier transform bin width, and (2) the potential benefit of SFE would be greater for CI users who have less neural cross-channel interaction.
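The bin-width constraint behind the first hypothesis can be illustrated with a minimal sketch. This is not the SFE or Crystalis implementation, which the abstract does not describe. The sampling rate, frame length, Hann window, and log-parabolic peak interpolation are all assumptions chosen for illustration, and `peak_frequency` is a hypothetical helper. The sketch shows only that a frequency estimate tied to bin centers is quantized to `fs / N`, whereas a feature-based estimate of the same spectral peak need not be.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the one-sided DFT of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def peak_frequency(frame, fs):
    """Return (bin_center_hz, interpolated_hz) for the strongest spectral peak.

    Hypothetical helper: the Hann window and the parabolic fit on log
    magnitudes are illustrative choices, not the method used in the study.
    """
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    mags = dft_magnitudes([x * w for x, w in zip(frame, window)])
    # Strongest interior bin, so both neighbours exist for the interpolation.
    k = max(range(1, len(mags) - 1), key=lambda i: mags[i])
    a, b, c = (math.log(mags[j] + 1e-12) for j in (k - 1, k, k + 1))
    # Vertex of the parabola through the three log-magnitude points.
    delta = 0.5 * (a - c) / (a - 2 * b + c)
    bin_width = fs / n
    return k * bin_width, (k + delta) * bin_width


if __name__ == "__main__":
    fs = 16000          # assumed sampling rate in Hz
    n = 128             # assumed frame length, giving a 125 Hz bin width
    tone_hz = 1040.0    # test tone that falls between two bin centers
    frame = [math.sin(2 * math.pi * tone_hz * i / fs) for i in range(n)]
    center, interpolated = peak_frequency(frame, fs)
    print(f"bin width: {fs / n} Hz")
    print(f"bin-center estimate: {center} Hz")
    print(f"interpolated estimate: {interpolated:.1f} Hz")
```

With these assumed parameters the bin-center estimate can only take values that are multiples of 125 Hz, while the interpolated estimate lands much closer to the test tone.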
To examine the first hypothesis, 6 users of Oticon Medical Digisonic SP CIs were tested in a double-blind design with the SFE and Crystalis strategies on several measures: word recognition in quiet, speech-in-noise reception threshold (SRT), consonant discrimination in quiet, listening effort, melody contour identification (MCI), and subjective sound quality. Word recognition and SRTs were measured on the first and last day of testing (4 to 5 days apart) to assess potential learning and/or acclimatization effects. The other tests were run once, between the first and last testing day. Listening effort was assessed by measuring pupil dilation. MCI involved identifying a five-tone contour among five possible contours. Sound quality was assessed subjectively using the multiple stimulus with hidden reference and anchor (MUSHRA) paradigm for sentences, music, and ambient sounds. To examine the second hypothesis, cross-channel interaction was assessed behaviorally using forward masking.
Word recognition was similar for the two strategies on the first day of testing and improved for both strategies on the last day of testing, with Crystalis improving significantly more. SRTs were worse with SFE than Crystalis on the first day of testing but became comparable on the last day of testing. Consonant discrimination scores were higher for Crystalis than for the SFE strategy. MCI scores and listening effort were not substantially different across strategies. Subjective sound quality scores were lower for the SFE than for the Crystalis strategy. The difference in performance with SFE and Crystalis was greater for CI users with higher channel interaction.
CI-user performance was similar with the SFE and Crystalis strategies. Longer acclimatization times may be required to reveal the full potential of the SFE strategy.