USING AUTOMATIC SPEECH RECOGNITION AND SPEECH SYNTHESIS TO IMPROVE THE INTELLIGIBILITY OF COCHLEAR IMPLANT USERS IN REVERBERANT LISTENING ENVIRONMENTS.

Authors

Chu Kevin, Collins Leslie, Mainsah Boyla

Affiliation

Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA.

Publication information

Proc IEEE Int Conf Acoust Speech Signal Process. 2020 May;2020:6929-6933. doi: 10.1109/icassp40776.2020.9054450. Epub 2020 May 14.

DOI: 10.1109/icassp40776.2020.9054450
PMID: 33078056
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7568341/
Abstract

Cochlear implant (CI) users experience substantial difficulties in understanding reverberant speech. A previous study proposed a strategy that leverages automatic speech recognition (ASR) to recognize reverberant speech and speech synthesis to translate the recognized text into anechoic speech. However, the strategy was trained and tested on the same reverberant environment, so it is unknown whether the strategy is robust to unseen environments. Thus, the current study investigated the performance of the previously proposed algorithm in multiple unseen environments. First, an ASR system was trained on anechoic and reverberant speech using different room types. Next, a speech synthesizer was trained to generate speech from the text predicted by the ASR system. Experiments were conducted in normal hearing listeners using vocoded speech, and the results showed that the strategy improved speech intelligibility in previously unseen conditions. These results suggest that the ASR-synthesis strategy can potentially benefit CI users in everyday reverberant environments.
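The abstract does not say how the reverberant training speech was produced. As an illustration only, a common way to model reverberation is to convolve an anechoic signal with a room impulse response (RIR). The sketch below assumes a synthetic RIR made of exponentially decaying Gaussian noise; the function names, the sample rate, and the RT60 values are hypothetical and are not taken from the paper.

```python
import math
import random


def decay_envelope(t_s, rt60_s):
    """Amplitude envelope that has fallen by 60 dB (a factor of 1e-3) at t = rt60_s."""
    return 10.0 ** (-3.0 * t_s / rt60_s)


def synthetic_rir(rt60_s, sample_rate, length_s, seed=0):
    """Hypothetical RIR: Gaussian noise shaped by an exponential decay (not the paper's rooms)."""
    rng = random.Random(seed)
    n = int(round(length_s * sample_rate))
    rir = [rng.gauss(0.0, 1.0) * decay_envelope(i / sample_rate, rt60_s) for i in range(n)]
    rir[0] = 1.0  # first tap stands in for the direct sound
    return rir


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


if __name__ == "__main__":
    sample_rate = 8000  # assumed value, chosen only to keep the example fast
    # A 20 ms, 440 Hz tone stands in for an anechoic utterance.
    dry = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(160)]
    # Three assumed reverberation times play the role of different room types.
    for rt60 in (0.3, 0.6, 0.9):
        wet = convolve(dry, synthetic_rir(rt60, sample_rate, 0.1))
        print(rt60, len(wet))
```

Holding out one of the assumed RT60 values during training and using it only at test time would mirror the seen versus unseen environment split that the study examines.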

Similar articles

1. USING AUTOMATIC SPEECH RECOGNITION AND SPEECH SYNTHESIS TO IMPROVE THE INTELLIGIBILITY OF COCHLEAR IMPLANT USERS IN REVERBERANT LISTENING ENVIRONMENTS.
Proc IEEE Int Conf Acoust Speech Signal Process. 2020 May;2020:6929-6933. doi: 10.1109/icassp40776.2020.9054450. Epub 2020 May 14.
2. USING MACHINE LEARNING TO MITIGATE THE EFFECTS OF REVERBERATION AND NOISE IN COCHLEAR IMPLANTS.
Proc Meet Acoust. 2018 May 7;33(1). doi: 10.1121/2.0000905. Epub 2018 Oct 8.
3. Objective intelligibility measurement of reverberant vocoded speech for normal-hearing listeners: Towards facilitating the development of speech enhancement algorithms for cochlear implants.
J Acoust Soc Am. 2024 Mar 1;155(3):2151-2168. doi: 10.1121/10.0025285.
4. Prior exposure to a reverberant listening environment improves speech intelligibility in adult cochlear implant listeners.
Cochlear Implants Int. 2016;17(2):98-104. doi: 10.1080/14670100.2015.1102455. Epub 2016 Feb 5.
5. A CAUSAL DEEP LEARNING FRAMEWORK FOR CLASSIFYING PHONEMES IN COCHLEAR IMPLANTS.
Proc IEEE Int Conf Acoust Speech Signal Process. 2021 Jun;2021:6498-6502. doi: 10.1109/icassp39728.2021.9413986. Epub 2021 May 13.
6. Reverberation suppression in cochlear implants using a blind channel-selection strategy.
J Acoust Soc Am. 2013 Jun;133(6):4188-96. doi: 10.1121/1.4804313.
7. Parameter tuning of time-frequency masking algorithms for reverberant artifact removal within the cochlear implant stimulus.
Cochlear Implants Int. 2022 Nov;23(6):309-316. doi: 10.1080/14670100.2022.2096182. Epub 2022 Jul 23.
8. Effects of source-to-listener distance and masking on perception of cochlear implant processed speech in reverberant rooms.
J Acoust Soc Am. 2009 Nov;126(5):2556-69. doi: 10.1121/1.3216912.
9. A procedure for testing speech intelligibility in a virtual listening environment.
Ear Hear. 1996 Jun;17(3):211-7. doi: 10.1097/00003446-199606000-00004.
10. The influence of audiovisual ceiling performance on the relationship between reverberation and directional benefit: perception and prediction.
Ear Hear. 2012 Sep-Oct;33(5):604-14. doi: 10.1097/AUD.0b013e31825641e4.

Cited by

1. A novel silent speech recognition approach based on parallel inception convolutional neural network and Mel frequency spectral coefficient.
Front Neurorobot. 2022 Sep 2;16:971446. doi: 10.3389/fnbot.2022.971446. eCollection 2022.
2. A CAUSAL DEEP LEARNING FRAMEWORK FOR CLASSIFYING PHONEMES IN COCHLEAR IMPLANTS.
Proc IEEE Int Conf Acoust Speech Signal Process. 2021 Jun;2021:6498-6502. doi: 10.1109/icassp39728.2021.9413986. Epub 2021 May 13.
