Importance of tonal envelope cues in Chinese speech recognition.

Authors

Fu Q J, Zeng F G, Shannon R V, Soli S D

Affiliation

Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.

Publication information

J Acoust Soc Am. 1998 Jul;104(1):505-10. doi: 10.1121/1.423251.

DOI: 10.1121/1.423251
PMID: 9670541
Abstract

Recent studies have shown that temporal waveform envelope cues can provide significant information for English speech recognition. This study investigated the use of temporal envelope cues in a tonal language: Mandarin Chinese. In this study, the speech was divided into several frequency analysis bands; the amplitude envelope was extracted from each band by half-wave rectification and low-pass filtering and was used to modulate a noise of the same bandwidth as the analysis band. These manipulations preserved temporal and amplitude cues in each frequency band, but removed the spectral detail within each band. Chinese vowels, consonants, tones and sentences were identified by 12 native Chinese-speaking listeners with 1, 2, 3, and 4 noise bands. The results showed that the recognition score of vowels, consonants, and sentences increased monotonically with the number of bands, a pattern similar to that observed in English speech recognition. In contrast, tones were consistently recognized at about 80% correct level, independent of the number of bands. This high level of tone recognition produced a significant difference in the open-set sentence recognition between Chinese (11.0%) and English (2.9%) for the one-band condition where no spectral information was available. The data also revealed that, with primarily temporal cues, the falling-rising tone (tone 3) and the falling tone (tone 4) were more easily recognized than the flat tone (tone 1) and the rising tone (tone 2). This differential pattern in tone recognition resulted in a similar pattern in word recognition: words having either tone 3 or 4 were more likely to be recognized while words having tone 1 and 2 were not. The quantitative role of tones in Chinese speech recognition was further explored using a power-function model and found to play a significant role in relating phoneme recognition to sentence recognition.
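
The band-splitting, envelope-extraction, and noise-modulation steps described in the abstract can be sketched as a small noise vocoder. This is a minimal sketch under stated assumptions, not the authors' implementation: the second-order filters, the band edges, the 50 Hz envelope cutoff, and every function name below are hypothetical choices made for illustration, since the abstract specifies none of them.

```python
import math
import random


def biquad(x, b, a):
    """Apply a second-order IIR filter (direct form I); a[0] is taken as 1."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def bandpass_coeffs(f_lo, f_hi, fs):
    """Audio EQ Cookbook band-pass (0 dB peak gain), centred on the geometric mean."""
    f0 = math.sqrt(f_lo * f_hi)
    q = f0 / (f_hi - f_lo)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    return ([alpha / a0, 0.0, -alpha / a0],
            [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0])


def lowpass_coeffs(fc, fs, q=math.sqrt(0.5)):
    """Audio EQ Cookbook low-pass; q = 1/sqrt(2) gives a Butterworth response."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b1 = (1 - cos_w0) / a0
    return ([b1 / 2, b1, b1 / 2],
            [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0])


def noise_vocode(signal, fs, band_edges, env_cutoff=50.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise.

    band_edges lists N+1 frequencies in Hz that delimit N analysis bands.
    env_cutoff is an assumed envelope low-pass cutoff, not a value from the paper.
    """
    rng = random.Random(seed)
    lp_b, lp_a = lowpass_coeffs(env_cutoff, fs)
    out = [0.0] * len(signal)
    for f_lo, f_hi in zip(band_edges[:-1], band_edges[1:]):
        bp_b, bp_a = bandpass_coeffs(f_lo, f_hi, fs)
        band = biquad(signal, bp_b, bp_a)              # analysis band
        rectified = [max(v, 0.0) for v in band]        # half-wave rectification
        envelope = biquad(rectified, lp_b, lp_a)       # low-pass filtering
        # Noise carrier limited to the same band as the analysis filter.
        noise = biquad([rng.gauss(0.0, 1.0) for _ in signal], bp_b, bp_a)
        for i, (e, n) in enumerate(zip(envelope, noise)):
            out[i] += max(e, 0.0) * n                  # clamp filter undershoot
    return out


if __name__ == "__main__":
    fs = 16000
    # Test signal: a 1 kHz tone with a slow 4 Hz amplitude modulation, 0.5 s long.
    tone = [(0.5 + 0.5 * math.sin(2 * math.pi * 4 * n / fs))
            * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 2)]
    # Hypothetical band edges for a four-band condition.
    vocoded = noise_vocode(tone, fs, [100, 800, 1500, 2500, 4000])
    print(len(vocoded), "samples synthesized")
```

Passing a single pair of edges, for example `[100, 4000]`, corresponds to a one-band condition: the output keeps the slow amplitude fluctuations of the input but carries no spectral detail beyond the overall band limits. A production implementation would more likely use higher-order filters from a signal-processing library; the biquads here only keep the sketch dependency-free.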

Similar articles

1. Importance of tonal envelope cues in Chinese speech recognition.
   J Acoust Soc Am. 1998 Jul;104(1):505-10. doi: 10.1121/1.423251.
2. Features of stimulation affecting tonal-speech perception: implications for cochlear prostheses.
   J Acoust Soc Am. 2002 Jul;112(1):247-58. doi: 10.1121/1.1487843.
3. Sine-wave speech recognition in a tonal language.
   J Acoust Soc Am. 2012 Feb;131(2):EL133-8. doi: 10.1121/1.3670594.
4. The Role of Lexical Tone Information in the Recognition of Mandarin Sentences in Listeners With Hearing Aids.
   Ear Hear. 2020 May/Jun;41(3):532-538. doi: 10.1097/AUD.0000000000000774.
5. Spectral and temporal cues in cochlear implant speech perception.
   Ear Hear. 2006 Apr;27(2):208-17. doi: 10.1097/01.aud.0000202312.31837.25.
6. Cantonese lexical tone recognition from frequency-specific temporal envelope and periodicity components in the same versus different noise band carriers.
   Cochlear Implants Int. 2009;10 Suppl 1:148-58. doi: 10.1179/cim.2009.10.Supplement-1.148.
7. Temporal and spectral cues in Mandarin tone recognition.
   J Acoust Soc Am. 2006 Nov;120(5 Pt 1):2830-40. doi: 10.1121/1.2346009.
8. Enhancing Chinese tone recognition by manipulating amplitude envelope: implications for cochlear implants.
   J Acoust Soc Am. 2004 Dec;116(6):3659-67. doi: 10.1121/1.1783352.
9. Speech Recognition in Noise in Adults and Children Who Speak English or Chinese as Their First Language.
   J Am Acad Audiol. 2018 Nov/Dec;29(10):885-897. doi: 10.3766/jaaa.17066.
10. Speech recognition with primarily temporal cues.
    Science. 1995 Oct 13;270(5234):303-4. doi: 10.1126/science.270.5234.303.

Cited by

1. Mandarin-speaking children with different types of cochlear implant exhibit variations in the activation patterns of their central auditory processing.
   Front Neurosci. 2024 Dec 16;18:1520415. doi: 10.3389/fnins.2024.1520415. eCollection 2024.
2. Outcomes Using the Optimized Pitch and Language Strategy Versus the Advanced Combination Encoder Strategy in Mandarin-Speaking Cochlear Implant Recipients.
   Ear Hear. 2025;46(1):210-222. doi: 10.1097/AUD.0000000000001572. Epub 2024 Aug 6.
3. Tonal language experience facilitates the use of spatial cues for segregating competing speech in bimodal cochlear implant listeners.
   JASA Express Lett. 2024 Mar 1;4(3). doi: 10.1121/10.0025058.
4. On the definition of noise.
   Humanit Soc Sci Commun. 2022;9(1):406. doi: 10.1057/s41599-022-01431-x. Epub 2022 Nov 8.
5. Identification of Minimal Pairs of Japanese Pitch Accent in Noise-Vocoded Speech.
   Front Psychol. 2022 May 31;13:887761. doi: 10.3389/fpsyg.2022.887761. eCollection 2022.
6. Differential weighting of temporal envelope cues from the low-frequency region for Mandarin sentence recognition in noise.
   BMC Neurosci. 2022 Jun 13;23(1):35. doi: 10.1186/s12868-022-00721-z.
7. Perception of English Stress of Synthesized Words by Three Chinese Dialect Groups.
   Front Psychol. 2022 Mar 16;13:803008. doi: 10.3389/fpsyg.2022.803008. eCollection 2022.
8. The common limitations in auditory temporal processing for Mandarin Chinese and Japanese.
   Sci Rep. 2022 Feb 22;12(1):3002. doi: 10.1038/s41598-022-06925-x.
9. A Review of Speech Perception of Mandarin-Speaking Children With Cochlear Implantation.
   Front Neurosci. 2021 Dec 14;15:773694. doi: 10.3389/fnins.2021.773694. eCollection 2021.
10. Relative Weights of Temporal Envelope Cues in Different Frequency Regions for Mandarin Vowel, Consonant, and Lexical Tone Recognition.
    Front Neurosci. 2021 Dec 2;15:744959. doi: 10.3389/fnins.2021.744959. eCollection 2021.