The intelligibility of noise-vocoded speech: spectral information available from across-channel comparison of amplitude envelopes.

Affiliation

Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK.

Publication information

Proc Biol Sci. 2011 May 22;278(1711):1595-600. doi: 10.1098/rspb.2010.1554. Epub 2010 Nov 10.

DOI:10.1098/rspb.2010.1554
PMID:21068039
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3081737/
Abstract

Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, when speech is processed through a small number of channels, listeners may infer the formant frequencies in the vocal-tract output (a key source of phonetic detail) from across-band differences in amplitude. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher formants (F3' ≈ N5 + N6), such that the frequency contour of each formant was implied by variations in relative amplitude between the bands within the corresponding pair. Three-formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each formant. These analogues were less intelligible than the NV stimuli, and less intelligible than analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than analogues in which the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can therefore provide phonetically important information about the frequency contours of the underlying formants.
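The frame-by-frame reconstruction described in the abstract can be illustrated with a minimal sketch. The abstract does not state how relative amplitude within a band pair was mapped to formant frequency, so the amplitude-weighted mean used here is an assumption, as are the band centre frequencies and all function names. Only the band pairing (F1 ≈ N1 + N2, F2 ≈ N3 + N4, F3' ≈ N5 + N6) and the constant-mean control condition come from the abstract.

```python
from math import sqrt
from statistics import fmean

# Hypothetical centre frequencies (Hz) for the six bands N1..N6.
# The abstract does not give these values; they are placeholders.
BAND_CENTRES_HZ = [300.0, 600.0, 1100.0, 1900.0, 2900.0, 4200.0]

# Band pairing taken from the abstract (zero-based band indices).
FORMANT_PAIRS = {"F1": (0, 1), "F2": (2, 3), "F3'": (4, 5)}


def implied_formant(env_low, env_high, fc_low, fc_high):
    """Return (frequency_hz, amplitude) implied by one frame of a band pair.

    Assumption: the frequency is the amplitude-weighted mean of the two
    band centres, and the amplitude is the root of the summed power.
    """
    total = env_low + env_high
    if total <= 0.0:
        # Silent frame: fall back to the midpoint with zero amplitude.
        return (fc_low + fc_high) / 2.0, 0.0
    freq = (env_low * fc_low + env_high * fc_high) / total
    amp = sqrt(env_low ** 2 + env_high ** 2)
    return freq, amp


def formant_tracks(envelopes):
    """Map six per-band envelope sequences to three formant tracks.

    envelopes: list of six equal-length sequences of envelope values,
    one value per analysis frame.
    """
    tracks = {}
    for name, (lo, hi) in FORMANT_PAIRS.items():
        tracks[name] = [
            implied_formant(a, b, BAND_CENTRES_HZ[lo], BAND_CENTRES_HZ[hi])
            for a, b in zip(envelopes[lo], envelopes[hi])
        ]
    return tracks


def flatten_frequency(track):
    """Control condition: replace the frequency contour with its mean
    while keeping the frame-by-frame amplitudes unchanged."""
    mean_freq = fmean(f for f, _ in track)
    return [(mean_freq, a) for _, a in track]
```

In this sketch, a frame in which the upper band of a pair is stronger pulls the implied formant frequency towards that band's centre, so the frequency contour follows the changes in relative amplitude over time. Applying `flatten_frequency` to a track removes that contour, which corresponds to the constant-frequency comparison condition reported as least intelligible.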
