Speech fine structure contains critical temporal cues to support speech segmentation.

Author information

Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, 60322, Germany.

Department of Neurosurgery, Duke University, Durham, NC, USA, 27710.

Publication information

Neuroimage. 2019 Nov 15;202:116152. doi: 10.1016/j.neuroimage.2019.116152. Epub 2019 Sep 1.

DOI:10.1016/j.neuroimage.2019.116152

PMID:31484039

Abstract

Segmenting the continuous speech stream into units for further perceptual and linguistic analyses is fundamental to speech recognition. The speech amplitude envelope (SE) has long been considered a fundamental temporal cue for segmenting speech. Does the temporal fine structure (TFS), a significant part of speech signals often considered to contain primarily spectral information, contribute to speech segmentation? Using magnetoencephalography, we show that the TFS entrains cortical responses between 3 and 6 Hz and demonstrate, using mutual information analysis, that (i) the temporal information in the TFS can be reconstructed from a measure of frame-to-frame spectral change and correlates with the SE and (ii) spectral resolution is key to the extraction of such temporal information. Furthermore, we show behavioural evidence that, when the SE is temporally distorted, the TFS provides cues for speech segmentation and aids speech recognition significantly. Our findings show that it is insufficient to investigate solely the SE to understand temporal speech segmentation, as the SE and the TFS derived from a band-filtering method convey comparable, if not inseparable, temporal information. We argue for a more synthetic view of speech segmentation: the auditory system groups speech signals coherently in both temporal and spectral domains.

Similar articles

The role of recovered envelope cues in the identification of temporal-fine-structure speech for hearing-impaired listeners.

J Acoust Soc Am. 2015 Jan;137(1):505-8. doi: 10.1121/1.4904540.

Distorting temporal fine structure by phase shifting and its effects on speech intelligibility and neural phase locking.

Sci Rep. 2017 Oct 17;7(1):13387. doi: 10.1038/s41598-017-12975-3.

Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues.

J Assoc Res Otolaryngol. 2017 Oct;18(5):687-710. doi: 10.1007/s10162-017-0627-7. Epub 2017 Jul 26.

Role of spectral and temporal cues in restoring missing speech information.

J Acoust Soc Am. 2010 Nov;128(5):EL294-9. doi: 10.1121/1.3501962.

Role of Binaural Temporal Fine Structure and Envelope Cues in Cocktail-Party Listening.

J Neurosci. 2016 Aug 3;36(31):8250-7. doi: 10.1523/JNEUROSCI.4421-15.2016.

The effects of noise vocoding on speech quality perception.

Hear Res. 2014 Mar;309:75-83. doi: 10.1016/j.heares.2013.11.011. Epub 2013 Dec 11.

Spectro-temporal cues enhance modulation sensitivity in cochlear implant users.

Hear Res. 2017 Aug;351:45-54. doi: 10.1016/j.heares.2017.05.009. Epub 2017 May 26.

Role of short-time acoustic temporal fine structure cues in sentence recognition for normal-hearing listeners.

J Acoust Soc Am. 2018 Feb;143(2):EL127. doi: 10.1121/1.5024817.

Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure.

Neuroimage. 2014 Mar;88:41-6. doi: 10.1016/j.neuroimage.2013.10.054. Epub 2013 Nov 2.

Cited by

Segmenting and Predicting Musical Phrase Structure Exploits Neural Gain Modulation and Phase Precession.

J Neurosci. 2024 Jul 24;44(30):e1331232024. doi: 10.1523/JNEUROSCI.1331-23.2024.

Impaired Cortical Tracking of Speech in Children with Developmental Language Disorder.

J Neurosci. 2024 May 29;44(22):e2048232024. doi: 10.1523/JNEUROSCI.2048-23.2024.

A review of auditory processing and cognitive change during normal ageing, and the implications for setting hearing aids for older adults.

Front Neurol. 2023 Jun 20;14:1122420. doi: 10.3389/fneur.2023.1122420. eCollection 2023.

The common limitations in auditory temporal processing for Mandarin Chinese and Japanese.

Sci Rep. 2022 Feb 22;12(1):3002. doi: 10.1038/s41598-022-06925-x.

Time consciousness: the missing link in theories of consciousness.

Neurosci Conscious. 2021 Apr 12;2021(1):niab011. doi: 10.1093/nc/niab011. eCollection 2021.

Phonemic restoration of interrupted locally time-reversed speech: Effects of segment duration and noise levels.

Atten Percept Psychophys. 2021 Jul;83(5):1928-1934. doi: 10.3758/s13414-021-02292-3. Epub 2021 Apr 13.

Delta/theta band EEG differentially tracks low and high frequency speech-derived envelopes.

Neuroimage. 2021 Jun;233:117958. doi: 10.1016/j.neuroimage.2021.117958. Epub 2021 Mar 17.

Modulation Spectra Capture EEG Responses to Speech Signals and Drive Distinct Temporal Response Functions.

eNeuro. 2021 Jan 14;8(1). doi: 10.1523/ENEURO.0399-20.2020. Print 2021 Jan-Feb.

Robust EEG-Based Decoding of Auditory Attention With High-RMS-Level Speech Segments in Noisy Conditions.

Front Hum Neurosci. 2020 Oct 7;14:557534. doi: 10.3389/fnhum.2020.557534. eCollection 2020.
