Reconstructing speech from human auditory cortex.

Affiliation

Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, California, United States of America.

Publication information

PLoS Biol. 2012 Jan;10(1):e1001251. doi: 10.1371/journal.pbio.1001251. Epub 2012 Jan 31.

DOI:10.1371/journal.pbio.1001251
PMID:22303281
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3269422/
Abstract

How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
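The linear reconstruction described in the abstract can be illustrated with a minimal sketch. It assumes ridge regression from time-lagged electrode responses to spectrogram channels, scored by per-channel correlation; the abstract does not specify the fitting procedure, so the regularizer, the lag window, and every name here (`lagged_design`, `fit_ridge`, `reconstruction_accuracy`) are hypothetical. The data are synthetic, not recordings from the study.

```python
import numpy as np

def lagged_design(responses, n_lags):
    """Stack each electrode's response at lags 0..n_lags-1 after time t."""
    n_times, n_elec = responses.shape
    X = np.zeros((n_times, n_elec * n_lags))
    for lag in range(n_lags):
        # Neural activity follows the sound, so row t uses responses at t + lag.
        X[: n_times - lag, lag * n_elec:(lag + 1) * n_elec] = responses[lag:]
    return X

def fit_ridge(X, S, alpha):
    """Closed-form ridge regression: W = (X'X + alpha*I)^-1 X'S."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ S)

def reconstruction_accuracy(S_true, S_pred):
    """Pearson correlation between true and reconstructed values, per channel."""
    a = S_true - S_true.mean(axis=0)
    b = S_pred - S_pred.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Synthetic stand-in data (assumption): 8 spectrogram channels drive
# 16 electrodes with a 2-sample delay plus noise.
rng = np.random.default_rng(0)
n_times, n_freq, n_elec, n_lags = 2000, 8, 16, 5
spectrogram = rng.standard_normal((n_times, n_freq))
mixing = rng.standard_normal((n_freq, n_elec))
responses = np.zeros((n_times, n_elec))
responses[2:] = spectrogram[:-2] @ mixing
responses += 0.5 * rng.standard_normal(responses.shape)

# Fit on the first 1500 samples, evaluate on the held-out remainder.
X = lagged_design(responses, n_lags)
train, test = slice(0, 1500), slice(1500, n_times)
W = fit_ridge(X[train], spectrogram[train], alpha=1.0)
accuracy = reconstruction_accuracy(spectrogram[test], X[test] @ W)
print(accuracy.round(2))
```

A purely linear map of this kind can only recover stimulus features that co-vary linearly with the neural response, which is consistent with the abstract's point that fast temporal fluctuations required a nonlinear, modulation-energy representation instead.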

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/cd30afe8bc9a/pbio.1001251.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/0bf68241a190/pbio.1001251.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/b60fba490a96/pbio.1001251.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/a2905164a755/pbio.1001251.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/d325e23205db/pbio.1001251.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/8ad7e0571876/pbio.1001251.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/b837b461da5e/pbio.1001251.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c86/3269422/1182ce6a706d/pbio.1001251.g008.jpg

Similar articles

1
Reconstructing speech from human auditory cortex.
PLoS Biol. 2012 Jan;10(1):e1001251. doi: 10.1371/journal.pbio.1001251. Epub 2012 Jan 31.
2
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.
J Neurosci. 2017 Aug 16;37(33):7906-7920. doi: 10.1523/JNEUROSCI.0238-17.2017. Epub 2017 Jul 17.
3
Towards reconstructing intelligible speech from the human auditory cortex.
Sci Rep. 2019 Jan 29;9(1):874. doi: 10.1038/s41598-018-37359-z.
4
Phonetic feature encoding in human superior temporal gyrus.
Science. 2014 Feb 28;343(6174):1006-10. doi: 10.1126/science.1245994. Epub 2014 Jan 30.
5
Speech Computations of the Human Superior Temporal Gyrus.
Annu Rev Psychol. 2022 Jan 4;73:79-102. doi: 10.1146/annurev-psych-022321-035256. Epub 2021 Oct 21.
6
Brainstem-cortical functional connectivity for speech is differentially challenged by noise and reverberation.
Hear Res. 2018 Sep;367:149-160. doi: 10.1016/j.heares.2018.05.018. Epub 2018 May 26.
7
Cortical processing of pitch: Model-based encoding and decoding of auditory fMRI responses to real-life sounds.
Neuroimage. 2018 Oct 15;180(Pt A):291-300. doi: 10.1016/j.neuroimage.2017.11.020. Epub 2017 Nov 13.
8
Parallel and distributed encoding of speech across human auditory cortex.
Cell. 2021 Sep 2;184(18):4626-4639.e13. doi: 10.1016/j.cell.2021.07.019. Epub 2021 Aug 18.
9
Decoding speech from spike-based neural population recordings in secondary auditory cortex of non-human primates.
Commun Biol. 2019 Dec 11;2:466. doi: 10.1038/s42003-019-0707-9. eCollection 2019.
10
Representation of temporal sound features in the human auditory cortex.
Rev Neurosci. 2011;22(2):187-203. doi: 10.1515/RNS.2011.016.

Cited by

1
Exploring an EM-algorithm for banded regression in computational neuroscience.
Imaging Neurosci (Camb). 2024 May 20;2. doi: 10.1162/imag_a_00155. eCollection 2024.
2
Neural signals, machine learning, and the future of inner speech recognition.
Front Hum Neurosci. 2025 Jul 10;19:1637174. doi: 10.3389/fnhum.2025.1637174. eCollection 2025.
3
Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation.
PLoS Biol. 2025 Jul 23;23(7):e3003293. doi: 10.1371/journal.pbio.3003293. eCollection 2025 Jul.
4
An open-access EEG dataset for speech decoding: Exploring the role of articulation and coarticulation.
Sci Data. 2025 Jun 17;12(1):1017. doi: 10.1038/s41597-025-05187-2.
5
Predicting artificial neural network representations to learn recognition model for music identification from brain recordings.
Sci Rep. 2025 May 29;15(1):18869. doi: 10.1038/s41598-025-02790-6.
6
VocalMind: A Stereotactic EEG Dataset for Vocalized, Mimed, and Imagined Speech in Tonal Language.
Sci Data. 2025 Apr 19;12(1):657. doi: 10.1038/s41597-025-04741-2.
7
Reconstructing Covert Consciousness: Neural Decoding as a Novel Consciousness Assessment.
Neurology. 2025 Feb 25;104(4):e210208. doi: 10.1212/WNL.0000000000210208. Epub 2025 Jan 30.
8
Infant low-frequency EEG cortical power, cortical tracking and phase-amplitude coupling predicts language a year later.
PLoS One. 2024 Dec 5;19(12):e0313274. doi: 10.1371/journal.pone.0313274. eCollection 2024.
9
Deep-learning models reveal how context and listener attention shape electrophysiological correlates of speech-to-language transformation.
PLoS Comput Biol. 2024 Nov 11;20(11):e1012537. doi: 10.1371/journal.pcbi.1012537. eCollection 2024 Nov.
10
FMRI speech tracking in primary and non-primary auditory cortex while listening to noisy scenes.
Commun Biol. 2024 Sep 30;7(1):1217. doi: 10.1038/s42003-024-06913-z.

References

1
Representation of temporal sound features in the human auditory cortex.
Rev Neurosci. 2011;22(2):187-203. doi: 10.1515/RNS.2011.016.
2
Unlocking the role of the superior temporal gyrus for speech sound categorization.
J Neurophysiol. 2011 Jun;105(6):2631-3. doi: 10.1152/jn.00238.2011. Epub 2011 Mar 23.
3
Incorporating naturalistic correlation structure improves spectrogram reconstruction from neuronal activity in the songbird auditory midbrain.
J Neurosci. 2011 Mar 9;31(10):3828-42. doi: 10.1523/JNEUROSCI.3256-10.2011.
4
Representation of speech categories in the primate auditory cortex.
J Neurophysiol. 2011 Jun;105(6):2634-46. doi: 10.1152/jn.00037.2011. Epub 2011 Feb 23.
5
Single-trial speech suppression of auditory cortex activity in humans.
J Neurosci. 2010 Dec 8;30(49):16643-50. doi: 10.1523/JNEUROSCI.1809-10.2010.
6
Spatiotemporal dynamics of electrocorticographic high gamma activity during overt and covert word repetition.
Neuroimage. 2011 Feb 14;54(4):2960-72. doi: 10.1016/j.neuroimage.2010.10.029. Epub 2010 Oct 26.
7
Categorical speech representation in human superior temporal gyrus.
Nat Neurosci. 2010 Nov;13(11):1428-32. doi: 10.1038/nn.2641. Epub 2010 Oct 3.
8
Neural representation of natural images in visual area V2.
J Neurosci. 2010 Feb 10;30(6):2102-14. doi: 10.1523/JNEUROSCI.4099-09.2010.
9
Spatiotemporal imaging of cortical activation during verb generation and picture naming.
Neuroimage. 2010 Mar;50(1):291-301. doi: 10.1016/j.neuroimage.2009.12.035. Epub 2009 Dec 21.
10
Temporal envelope of time-compressed speech represented in the human auditory cortex.
J Neurosci. 2009 Dec 9;29(49):15564-74. doi: 10.1523/JNEUROSCI.3065-09.2009.