Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience.

Authors

Coath Martin, Denham Susan L

Affiliation

Centre for Theoretical and Computational Neuroscience, University of Plymouth, Drakes Circus PL4 8AA, UK.

Publication details

Biol Cybern. 2005 Jul;93(1):22-30. doi: 10.1007/s00422-005-0560-4. Epub 2005 Jun 8.

DOI: 10.1007/s00422-005-0560-4
PMID: 15944856

Abstract

Models of auditory processing, particularly of speech, face many difficulties. Included in these are variability among speakers, variability in speech rate, and robustness to moderate distortions such as time compression. We constructed a system based on ensembles of feature detectors derived from fragments of an onset-sensitive sound representation. This method is based on the idea of 'spectro-temporal response fields' and uses convolution to measure the degree of similarity through time between the feature detectors and the stimulus. The output from the ensemble was used to derive segmentation cues and patterns of response, which were used to train an artificial neural network (ANN) classifier. This allowed us to estimate a lower bound for the mutual information between the class of the input and the class of the output. Our results suggest that there is significant information in the output of our system, and that this is robust with respect to the exact choice of feature set, time compression in the stimulus, and speaker variation. In addition, the robustness to time compression in the stimulus has features in common with human psychophysics. Similar experiments using feature detectors derived from fragments of non-speech sounds performed less well. This result is interesting in the light of results showing aberrant cortical development in animals exposed to impoverished auditory environments during the developmental phase.
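
As a rough illustration of two steps named in the abstract, the Python sketch below (not the authors' implementation) slides a time-frequency feature detector along a stimulus with a convolution to obtain a similarity-through-time response, and computes the mutual information between true and predicted class from a classifier's confusion matrix. The function names, the array shapes, the use of NumPy, and the confusion-matrix route to the estimate are all assumptions made for illustration. By the data-processing inequality, a classifier's decisions cannot carry more information about the input class than the representation it reads from, which is why a figure of this kind acts as a lower bound.

```python
import numpy as np


def detector_response(stimulus, detector):
    """Similarity through time between one feature detector and a stimulus.

    Both inputs are hypothetical (n_channels, n_frames) time-frequency arrays.
    The detector is slid along the time axis only. Convolving each channel
    with the time-reversed detector gives, at every offset, the sum of
    elementwise products between the detector and the stimulus patch under it.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    detector = np.asarray(detector, dtype=float)
    if stimulus.shape[0] != detector.shape[0]:
        raise ValueError("stimulus and detector need the same channel count")
    if detector.shape[1] > stimulus.shape[1]:
        raise ValueError("detector is longer than the stimulus")
    n_offsets = stimulus.shape[1] - detector.shape[1] + 1
    response = np.zeros(n_offsets)
    for channel in range(stimulus.shape[0]):
        response += np.convolve(
            stimulus[channel], detector[channel][::-1], mode="valid"
        )
    return response


def mutual_information_bits(confusion):
    """Mutual information (bits) between true and predicted class.

    `confusion` is a count matrix with rows = true class, columns = predicted
    class (an assumed layout). Zero cells contribute nothing to the sum.
    """
    joint = np.asarray(confusion, dtype=float)
    joint = joint / joint.sum()
    p_true = joint.sum(axis=1, keepdims=True)   # shape (n_true, 1)
    p_pred = joint.sum(axis=0, keepdims=True)   # shape (1, n_pred)
    independent = p_true @ p_pred               # product of the marginals
    nonzero = joint > 0
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / independent[nonzero])))


# Toy usage: embed the detector in an otherwise silent stimulus.
detector = np.array([[1.0, 2.0], [3.0, 4.0]])
stimulus = np.zeros((2, 6))
stimulus[:, 2:4] = detector
response = detector_response(stimulus, detector)
print(response.argmax())                          # offset of the best match
print(mutual_information_bits([[5, 0], [0, 5]]))  # perfect two-class classifier
```

In a fuller pipeline, one response trace per detector in the ensemble would be stacked into a response pattern and passed to the classifier. That stage is omitted here because the abstract does not specify it.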

Similar articles

1. Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience.
   Biol Cybern. 2005 Jul;93(1):22-30. doi: 10.1007/s00422-005-0560-4. Epub 2005 Jun 8.
2. Multiple views of the response of an ensemble of spectro-temporal features support concurrent classification of utterance, prosody, sex and speaker identity.
   Network. 2005 Jun-Sep;16(2-3):285-300. doi: 10.1080/09548980500290120.
3. Representation of spectrotemporal sound information in the ascending auditory pathway.
   Biol Cybern. 2003 Nov;89(5):350-62. doi: 10.1007/s00422-003-0440-8. Epub 2003 Dec 4.
4. Segmental processing in the human auditory dorsal stream.
   Brain Res. 2008 Jul 18;1220:179-90. doi: 10.1016/j.brainres.2007.11.013. Epub 2007 Nov 17.
5. Encoding of pitch in the human brainstem is sensitive to language experience.
   Brain Res Cogn Brain Res. 2005 Sep;25(1):161-8. doi: 10.1016/j.cogbrainres.2005.05.004.
6. Selective tuning of cortical sound-feature processing by language experience.
   Eur J Neurosci. 2006 May;23(9):2538-41. doi: 10.1111/j.1460-9568.2006.04752.x.
7. The linearity of emergent spectro-temporal receptive fields in a model of auditory cortex.
   Biosystems. 2008 Oct-Nov;94(1-2):60-7. doi: 10.1016/j.biosystems.2008.05.011. Epub 2008 Jun 20.
8. The representation of noise vocoded speech in the auditory nerve of the chinchilla: physiological correlates of the perception of spectrally reduced speech.
   Hear Res. 2006 Mar;213(1-2):130-44. doi: 10.1016/j.heares.2006.01.011. Epub 2006 Feb 23.
9. Should spikes be treated with equal weightings in the generation of spectro-temporal receptive fields?
   J Physiol Paris. 2010 May-Sep;104(3-4):215-22. doi: 10.1016/j.jphysparis.2009.11.026. Epub 2009 Nov 23.
10. Speaker normalization using cortical strip maps: a neural model for steady-state vowel categorization.
    J Acoust Soc Am. 2008 Dec;124(6):3918-36. doi: 10.1121/1.2997478.

Cited by

1. Matching Pursuit Analysis of Auditory Receptive Fields' Spectro-Temporal Properties.
   Front Syst Neurosci. 2017 Feb 9;11:4. doi: 10.3389/fnsys.2017.00004. eCollection 2017.
2. Using naturalistic utterances to investigate vocal communication processing and development in human and non-human primates.
   Hear Res. 2013 Nov;305:74-85. doi: 10.1016/j.heares.2013.08.009. Epub 2013 Aug 29.