Learning Midlevel Auditory Codes from Natural Sound Statistics.

Authors

Młynarski Wiktor, McDermott Josh H

Affiliation

Department of Brain and Cognitive Sciences, MIT, Cambridge, MA

Publication

Neural Comput. 2018 Mar;30(3):631-669. doi: 10.1162/neco_a_01048. Epub 2017 Dec 8.

DOI:10.1162/neco_a_01048
PMID:29220308
Abstract

Interaction with the world requires an organism to transform sensory signals into representations in which behaviorally meaningful properties of the environment are made explicit. These representations are derived through cascades of neuronal processing stages in which neurons at each stage recode the output of preceding stages. Explanations of sensory coding may thus involve understanding how low-level patterns are combined into more complex structures. To gain insight into such midlevel representations for sound, we designed a hierarchical generative model of natural sounds that learns combinations of spectrotemporal features from natural stimulus statistics. In the first layer, the model forms a sparse convolutional code of spectrograms using a dictionary of learned spectrotemporal kernels. To generalize from specific kernel activation patterns, the second layer encodes patterns of time-varying magnitude of multiple first-layer coefficients. When trained on corpora of speech and environmental sounds, some second-layer units learned to group similar spectrotemporal features. Others instantiate opponency between distinct sets of features. Such groupings might be instantiated by neurons in the auditory cortex, providing a hypothesis for midlevel neuronal computation.
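The two-layer structure described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names `reconstruct` and `magnitude_envelope`, the array shapes, and the moving-average smoothing are all assumptions made for illustration. The first function shows how a convolutional code rebuilds a spectrogram as a sum of spectrotemporal kernels, each convolved in time with its coefficient trace. The second computes a smoothed magnitude of each trace, which is only a stand-in for the kind of time-varying magnitude signal that a second layer might encode; the learned second-layer model itself is not reproduced here.

```python
import numpy as np


def reconstruct(kernels, coeffs):
    """Rebuild a spectrogram from a convolutional code (illustrative only).

    kernels: (K, F, L) array of K spectrotemporal kernels, each spanning
             F frequency channels and L time frames.
    coeffs:  (K, T) array of activations over T time frames; in a sparse
             code most entries would be zero.
    Returns an (F, T + L - 1) array: the sum over kernels of each kernel
    convolved along time with its coefficient trace.
    """
    K, F, L = kernels.shape
    T = coeffs.shape[1]
    out = np.zeros((F, T + L - 1))
    for k in range(K):
        for f in range(F):
            # Full 1-D convolution along time for one frequency channel.
            out[f] += np.convolve(coeffs[k], kernels[k, f])
    return out


def magnitude_envelope(coeffs, width=5):
    """Moving average of |coeffs| along time (hypothetical smoothing choice).

    Returns an array with the same (K, T) shape as coeffs, assuming T >= width.
    """
    window = np.ones(width) / width
    return np.stack(
        [np.convolve(np.abs(c), window, mode="same") for c in coeffs]
    )
```

For example, a coefficient array that is zero everywhere except for a single unit activation of one kernel reconstructs a spectrogram containing a copy of that kernel at the activation time.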

Similar articles

1. Learning Midlevel Auditory Codes from Natural Sound Statistics.
   Neural Comput. 2018 Mar;30(3):631-669. doi: 10.1162/neco_a_01048. Epub 2017 Dec 8.
2. Sparse codes of harmonic natural sounds and their modulatory interactions.
   Network. 2009;20(4):253-67. doi: 10.3109/09548980903447751.
3. Incorporating Midbrain Adaptation to Mean Sound Level Improves Models of Auditory Cortical Processing.
   J Neurosci. 2016 Jan 13;36(2):280-9. doi: 10.1523/JNEUROSCI.2441-15.2016.
4. Sound categories are represented as distributed patterns in the human auditory cortex.
   Curr Biol. 2009 Mar 24;19(6):498-502. doi: 10.1016/j.cub.2009.01.066. Epub 2009 Mar 5.
5. Emergent categorical representation of natural, complex sounds resulting from the early post-natal sound environment.
   Neuroscience. 2013 Sep 17;248:30-42. doi: 10.1016/j.neuroscience.2013.05.056. Epub 2013 Jun 6.
6. ARTSTREAM: a neural network model of auditory scene analysis and source segregation.
   Neural Netw. 2004 May;17(4):511-36. doi: 10.1016/j.neunet.2003.10.002.
7. Responses of auditory-cortex neurons to structural features of natural sounds.
   Nature. 1999 Jan 14;397(6715):154-7. doi: 10.1038/16456.
8. Neural coding strategies in auditory cortex.
   Hear Res. 2007 Jul;229(1-2):81-93. doi: 10.1016/j.heares.2007.01.019. Epub 2007 Jan 25.
9. Enhanced sound perception by widespread-onset neuronal responses in auditory cortex.
   Neural Comput. 2007 Dec;19(12):3310-34. doi: 10.1162/neco.2007.19.12.3310.
10. Input-Specific Gain Modulation by Local Sensory Context Shapes Cortical and Thalamic Responses to Complex Sounds.
    Neuron. 2016 Jul 20;91(2):467-81. doi: 10.1016/j.neuron.2016.05.041. Epub 2016 Jun 23.

Cited by

1. Sparse high-dimensional decomposition of non-primary auditory cortical receptive fields.
   PLoS Comput Biol. 2025 Jan 2;21(1):e1012721. doi: 10.1371/journal.pcbi.1012721. eCollection 2025 Jan.
2. Interference of mid-level sound statistics underlie human speech recognition sensitivity in natural noise.
   bioRxiv. 2024 Oct 4:2024.02.13.579526. doi: 10.1101/2024.02.13.579526.
3. Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions.
   PLoS Biol. 2023 Dec 13;21(12):e3002366. doi: 10.1371/journal.pbio.3002366. eCollection 2023 Dec.
4. Quantitative models of auditory cortical processing.
   Hear Res. 2023 Mar 1;429:108697. doi: 10.1016/j.heares.2023.108697. Epub 2023 Jan 14.
5. [Evolution of auditory response signal-to-noise ratio in ascending auditory pathways].
   Nan Fang Yi Ke Da Xue Xue Bao. 2021 Nov 20;41(11):1712-1718. doi: 10.12122/j.issn.1673-4254.2021.11.17.
6. Time-dependent discrimination advantages for harmonic sounds suggest efficient coding for memory.
   Proc Natl Acad Sci U S A. 2020 Dec 15;117(50):32169-32180. doi: 10.1073/pnas.2008956117. Epub 2020 Dec 1.
7. Fine-grained statistical structure of speech.
   PLoS One. 2020 Mar 20;15(3):e0230233. doi: 10.1371/journal.pone.0230233. eCollection 2020.
8. Ecological origins of perceptual grouping principles in the auditory system.
   Proc Natl Acad Sci U S A. 2019 Dec 10;116(50):25355-25364. doi: 10.1073/pnas.1903887116. Epub 2019 Nov 21.
9. Simple Acoustic Features Can Explain Phoneme-Based Predictions of Cortical Responses to Speech.
   Curr Biol. 2019 Jun 17;29(12):1924-1937.e9. doi: 10.1016/j.cub.2019.04.067. Epub 2019 May 23.
10. Cascaded Tuning to Amplitude Modulation for Natural Sound Recognition.
    J Neurosci. 2019 Jul 10;39(28):5517-5533. doi: 10.1523/JNEUROSCI.2914-18.2019. Epub 2019 May 15.