Effect of speaking rate on recognition of synthetic and natural speech by normal-hearing and cochlear implant listeners.

Affiliation

Department of Otorhinolaryngology, Shandong University, Qi Lu Hospital, Jinan, People's Republic of China.

Publication information

Ear Hear. 2013 May-Jun;34(3):313-23. doi: 10.1097/AUD.0b013e31826fe79e.

Abstract

OBJECTIVE

Most studies have evaluated cochlear implant (CI) performance using "clear" speech materials, which are highly intelligible and well articulated. CI users may encounter much greater variability in speech patterns in the "real world," including synthetic speech. In this study, the authors measured sentence recognition with multiple talkers and speaking rates, and with naturally produced and synthetic speech in listeners with normal hearing (NH) and CIs.

DESIGN

NH and CI subjects were asked to recognize naturally produced or synthetic sentences, presented at a slow, normal, or fast speaking rate. Natural speech was produced by one male and one female talker; synthetic speech was generated to simulate a male and female talker. For natural speech, the speaking rate was time-scaled while preserving voice pitch and formant frequency information. For synthetic speech, the speaking rate was adjusted within the speech synthesis engine. NH subjects were tested while listening to unprocessed speech or to an eight-channel acoustic CI simulation. CI subjects were tested while listening with their clinical processors and the recommended microphone sensitivity and volume settings.
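
As a rough illustration of the eight-channel acoustic CI simulation mentioned above, the sketch below computes analysis band edges of the kind such simulations often use. The abstract does not give the filter bank design, so everything specific here is an assumption: the 200 to 7000 Hz input range, the spacing of bands by equal cochlear distance using the Greenwood frequency-place function (human constants A = 165.4, a = 2.1, k = 0.88), and the hypothetical helper name `band_edges`. It is not the authors' implementation.

```python
import math

# Greenwood frequency-place constants for the human cochlea
# (x is the proportional distance from apex, 0..1).
A, ALPHA, K = 165.4, 2.1, 0.88


def freq_to_place(freq_hz):
    """Map a frequency in Hz to a proportional cochlear place."""
    return math.log10(freq_hz / A + K) / ALPHA


def place_to_freq(place):
    """Map a proportional cochlear place back to a frequency in Hz."""
    return A * (10 ** (ALPHA * place) - K)


def band_edges(n_channels=8, f_lo=200.0, f_hi=7000.0):
    """Return n_channels + 1 band edges, equally spaced in cochlear place.

    The frequency range is an assumed value, not taken from the abstract.
    """
    x_lo, x_hi = freq_to_place(f_lo), freq_to_place(f_hi)
    step = (x_hi - x_lo) / n_channels
    return [place_to_freq(x_lo + i * step) for i in range(n_channels + 1)]


if __name__ == "__main__":
    edges = band_edges()
    for ch, (lo, hi) in enumerate(zip(edges, edges[1:]), start=1):
        print(f"channel {ch}: {lo:7.1f} Hz to {hi:7.1f} Hz")
```

In a full simulation, each band's temporal envelope would typically be extracted and used to modulate a noise or sine carrier before the bands are summed; that stage is omitted here because the abstract does not describe it.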

RESULTS

The NH group performed significantly better than did the CI-simulation group, and the CI-simulation group performed significantly better than did the CI group. For all subject groups, sentence recognition was significantly better with natural speech than with synthetic speech. The performance deficit with synthetic speech was relatively small for NH subjects listening to unprocessed speech. However, the performance deficit with synthetic speech was much greater for CI subjects and for CI-simulation subjects. There was a significant effect of talker gender, with slightly better performance with the female talker for CI subjects and slightly better performance with the male talker for the CI simulations. For all subject groups, sentence recognition was significantly poorer only at the fast rate. CI performance was very poor (approximately 10% correct) at the fast rate.

CONCLUSIONS

CI listeners are susceptible to variability in speech patterns caused by speaking rate and production style (natural versus synthetic). CI performance with clear speech materials may overestimate performance in real-world listening conditions. The poorer CI performance may be due to factors other than reduced spectro-temporal resolution, such as the quality of electric stimulation, duration of deafness, or cortical processing. Optimizing the input or training may improve CI users' tolerance for variability in speech patterns.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/daad/3610785/ded3ab071b7a/nihms409613f1.jpg
