Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation.

Affiliation

Department of Electrical Engineering and Brain Science Research Center, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong Yuseong-gu, Daejeon 305-701, Republic of Korea.

Publication information

Neural Netw. 2013 Sep;45:62-9. doi: 10.1016/j.neunet.2013.02.006. Epub 2013 Mar 7.

DOI:10.1016/j.neunet.2013.02.006
PMID:23558292
Abstract

A nonlinear speech feature extraction algorithm was developed by modeling human cochlear functions and was demonstrated as a noise-robust front end for speech recognition systems. The algorithm is based on a model of the organ of Corti in the human cochlea, including components such as the basilar membrane (BM), outer hair cells (OHCs), and inner hair cells (IHCs). The frequency-dependent nonlinear compression and amplification performed by the OHCs were modeled by lateral inhibition, which enhances spectral contrast. In particular, the compression coefficients were made frequency-dependent on the basis of psychoacoustic evidence. Spectral subtraction and temporal adaptation were applied in the time-frame domain. With long-term and short-term adaptation characteristics, these two steps remove stationary or slowly varying components and amplify temporal changes such as onsets and offsets. The proposed features were evaluated on a noisy speech database and outperformed baseline methods such as mel-frequency cepstral coefficients (MFCCs) and RASTA-PLP in unknown noisy conditions.
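The abstract names three processing ideas: per-channel compression, lateral inhibition across frequency channels, and temporal adaptation across frames. The sketch below is only an illustration of those ideas and is not the authors' model. The function names, the power-law form of the compression, the three-tap inhibition kernel, the single adaptation time constant `alpha`, and all parameter values are assumptions chosen for readability; the paper's actual equations and coefficients are not given in the abstract.

```python
from typing import List

Frame = List[float]  # one time frame: non-negative energy per frequency channel


def compress(frame: Frame, exponents: List[float]) -> Frame:
    """Power-law compression with one (assumed) exponent per channel."""
    return [max(x, 0.0) ** c for x, c in zip(frame, exponents)]


def lateral_inhibition(frame: Frame, strength: float = 0.5) -> Frame:
    """Subtract a fraction of the mean of the two neighbouring channels.

    Edge channels reuse their own value for the missing neighbour.
    The result is half-wave rectified, so spectral peaks survive
    while flat regions and valleys are pushed towards zero.
    """
    n = len(frame)
    out = []
    for k in range(n):
        left = frame[k - 1] if k > 0 else frame[k]
        right = frame[k + 1] if k < n - 1 else frame[k]
        out.append(max(frame[k] - strength * 0.5 * (left + right), 0.0))
    return out


def temporal_adaptation(frames: List[Frame], alpha: float = 0.9) -> List[Frame]:
    """Subtract a running average per channel from each frame.

    The running average tracks stationary or slowly varying energy,
    so the output is near zero for steady input and large at onsets
    (positive) and offsets (negative). A single time constant is used
    here; the abstract mentions both long-term and short-term adaptation.
    """
    if not frames:
        return []
    state = list(frames[0])
    out = []
    for frame in frames:
        out.append([x - s for x, s in zip(frame, state)])
        # First-order recursive average: move the state towards the input.
        state = [s + (1.0 - alpha) * (x - s) for s, x in zip(state, frame)]
    return out


if __name__ == "__main__":
    # A spectral peak in the middle channel is kept; its neighbours are suppressed.
    print(lateral_inhibition([1.0, 3.0, 1.0]))
    # A step in energy at the third frame produces a decaying onset response.
    print(temporal_adaptation([[1.0], [1.0], [5.0], [5.0]]))
```

In a full front end these steps would be applied to the outputs of a cochlear filter bank, frame by frame, before the features are passed to the recognizer; that filter bank is outside the scope of this sketch.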


Similar articles

1
Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition.
J Acoust Soc Am. 2012 May;131(5):4134-51. doi: 10.1121/1.3699200.
2
Spectro-temporal modulation energy based mask for robust speaker identification.
J Acoust Soc Am. 2012 May;131(5):EL368-74. doi: 10.1121/1.3697534.
3
An improved speech processing strategy for cochlear implants based on an active nonlinear filterbank model of the biological cochlea.
IEEE Trans Biomed Eng. 2009 Mar;56(3):828-36. doi: 10.1109/TBME.2008.2007850. Epub 2008 Oct 31.
4
A model of auditory perception as front end for automatic speech recognition.
J Acoust Soc Am. 1999 Oct;106(4 Pt 1):2040-50. doi: 10.1121/1.427950.
5
A study of hearing aid gain functions based on a nonlinear nonlocal feedforward cochlea model.
Hear Res. 2006 May;215(1-2):84-96. doi: 10.1016/j.heares.2006.03.013. Epub 2006 May 6.
6
Auditory-model based robust feature selection for speech recognition.
J Acoust Soc Am. 2010 Feb;127(2):EL73-9. doi: 10.1121/1.3284545.
7
Do we need STRFs for cocktail parties? On the relevance of physiologically motivated features for human speech perception derived from automatic speech recognition.
Adv Exp Med Biol. 2013;787:333-41. doi: 10.1007/978-1-4614-1590-9_37.
8
Dynamic formant tracking of noisy speech using temporal analysis on outputs from a nonlinear cochlear model.
IEEE Trans Biomed Eng. 1993 May;40(5):456-67. doi: 10.1109/10.243416.
9
Cochlea-inspired speech recognition interface.
Med Biol Eng Comput. 2019 Jun;57(6):1393-1403. doi: 10.1007/s11517-019-01963-6. Epub 2019 Mar 4.