Spectral-shape features versus formants as acoustic correlates for vowels.

Authors

Zahorian S A, Jagharghi A J

Affiliation

Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, Virginia 23529.

Publication information

J Acoust Soc Am. 1993 Oct;94(4):1966-82. doi: 10.1121/1.407520.

DOI: 10.1121/1.407520
PMID: 8227741
Abstract

The first three formants, i.e., the first three spectral prominences of the short-time magnitude spectra, have been the most commonly used acoustic cues for vowels ever since the work of Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)]. However, spectral-shape features, which encode the global smoothed spectrum, provide a more complete spectral description, and therefore might be even better acoustic correlates for vowels. In this study, automatic vowel classification experiments were used to compare formants and spectral-shape features for monophthongal vowels spoken in the context of isolated CVC words, under a variety of conditions. The roles of static and time-varying information for vowel discrimination were also compared. Spectral shape was encoded using the coefficients in a cosine expansion of the nonlinearly scaled magnitude spectrum. Under almost all conditions investigated, in the absence of fundamental frequency (F0) information, automatic vowel classification based on spectral-shape features was superior to that based on formants. If F0 was used as an additional feature, vowel classification based on spectral-shape features was still superior to that based on formants, but the differences between the two feature sets were reduced. It was also found that the error pattern of perceptual confusions was more closely correlated with errors in automatic classification obtained from spectral-shape features than with classification errors from formants. Therefore, it is concluded that spectral-shape features are a more complete set of acoustic correlates for vowel identity than are formants. In comparing static and time-varying features, static features were the most important for vowel discrimination, but feature trajectories were valuable secondary sources of information.
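
The abstract describes the spectral-shape encoding only in one sentence: coefficients of a cosine expansion of a nonlinearly scaled magnitude spectrum. The sketch below illustrates that general idea and is not the authors' implementation. The mel-style frequency warping, the dB amplitude scaling, the number of resampling points, and the function names `warped_log_spectrum` and `cosine_coefficients` are all assumptions chosen for illustration, since the abstract does not state the exact scaling used.

```python
import math


def hz_to_mel(f_hz):
    # Assumed warping: a common mel formula, not taken from the paper.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def warped_log_spectrum(magnitudes, f_max_hz, n_points=64):
    """Resample a magnitude spectrum onto a mel-spaced grid, in dB.

    `magnitudes` holds samples on a uniform grid from 0 Hz to f_max_hz.
    Both the frequency warping and the log amplitude scaling are assumptions.
    """
    n_bins = len(magnitudes)
    mel_max = hz_to_mel(f_max_hz)
    warped = []
    for m in range(n_points):
        # Midpoint of the m-th interval on the warped axis.
        mel = (m + 0.5) / n_points * mel_max
        pos = mel_to_hz(mel) / f_max_hz * (n_bins - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n_bins - 1)
        frac = pos - lo
        # Linear interpolation between neighbouring spectral samples.
        mag = (1.0 - frac) * magnitudes[lo] + frac * magnitudes[hi]
        warped.append(20.0 * math.log10(max(mag, 1e-10)))
    return warped


def cosine_coefficients(values, n_coeffs):
    """Return the first n_coeffs terms of a DCT-II style cosine expansion."""
    n = len(values)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            v * math.cos(math.pi * k * (m + 0.5) / n)
            for m, v in enumerate(values)
        )
        coeffs.append(total / n)
    return coeffs


if __name__ == "__main__":
    # Toy spectrum with a single broad peak near 1 kHz (illustrative only).
    spectrum = [
        1.0 + 9.0 * math.exp(-((i * 8000.0 / 128 - 1000.0) / 400.0) ** 2)
        for i in range(129)
    ]
    features = cosine_coefficients(warped_log_spectrum(spectrum, 8000.0), 10)
    print([round(c, 3) for c in features])
```

In this sketch the zeroth coefficient is the mean level of the warped spectrum, and higher-order coefficients capture progressively finer spectral detail, so truncating the expansion keeps a smoothed, global description of the spectrum rather than individual peak locations.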

Similar articles

1
Spectral-shape features versus formants as acoustic correlates for vowels.
J Acoust Soc Am. 1993 Oct;94(4):1966-82. doi: 10.1121/1.407520.
2
Evaluation of formant-like features on an automatic vowel classification task.
J Acoust Soc Am. 2004 Sep;116(3):1781-92. doi: 10.1121/1.1781620.
3
A perceptual model of vowel recognition based on the auditory representation of American English vowels.
J Acoust Soc Am. 1986 Apr;79(4):1086-100. doi: 10.1121/1.393381.
4
Identification of steady-state vowels synthesized from the Peterson and Barney measurements.
J Acoust Soc Am. 1993 Aug;94(2 Pt 1):668-74. doi: 10.1121/1.406884.
5
Speaker normalization of static and dynamic vowel spectral features.
J Acoust Soc Am. 1991 Jul;90(1):67-75. doi: 10.1121/1.402350.
6
Static features in real-time recognition of isolated vowels at high pitch.
J Acoust Soc Am. 2007 Oct;122(4):2389-404. doi: 10.1121/1.2772228.
7
Perceptual separation of simultaneous vowels: within and across-formant grouping by F0.
J Acoust Soc Am. 1993 Jun;93(6):3454-67. doi: 10.1121/1.405675.
8
The relative importance of spectral tilt in monophthongs and diphthongs.
J Acoust Soc Am. 2005 Mar;117(3 Pt 1):1395-404. doi: 10.1121/1.1861158.
9
Fundamental frequency effects on thresholds for vowel formant discrimination.
J Acoust Soc Am. 1996 Oct;100(4 Pt 1):2462-70. doi: 10.1121/1.417954.
10
On the sufficiency of compound target specification of isolated vowels and vowels in /bVb/ syllables.
J Acoust Soc Am. 1992 Jan;91(1):390-410. doi: 10.1121/1.402781.

Cited by

1
Receptive-field nonlinearities in primary auditory cortex: a comparative perspective.
Cereb Cortex. 2024 Sep 3;34(9). doi: 10.1093/cercor/bhae364.
2
Adaptation to Noise in Spectrotemporal Modulation Detection and Word Recognition.
Trends Hear. 2024 Jan-Dec;28:23312165241266322. doi: 10.1177/23312165241266322.
3
Improving the workflow to crack Small, Unbalanced, Noisy, but Genuine (SUNG) datasets in bioacoustics: The case of bonobo calls.
PLoS Comput Biol. 2023 Apr 13;19(4):e1010325. doi: 10.1371/journal.pcbi.1010325. eCollection 2023 Apr.
4
Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input.
Proc Natl Acad Sci U S A. 2021 Feb 9;118(7). doi: 10.1073/pnas.2001844118.
5
What Acoustic Studies Tell Us About Vowels in Developing and Disordered Speech.
Am J Speech Lang Pathol. 2020 Aug 4;29(3):1749-1778. doi: 10.1044/2020_AJSLP-19-00178. Epub 2020 Jul 6.
6
Perception of local and non-local vowels by adults and children in the South.
J Acoust Soc Am. 2020 Jan;147(1):627. doi: 10.1121/10.0000542.
7
Static measurements of vowel formant frequencies and bandwidths: A review.
J Commun Disord. 2018 Jul-Aug;74:74-97. doi: 10.1016/j.jcomdis.2018.05.004. Epub 2018 Jun 1.
8
Contribution of formant frequency information to vowel perception in steady-state noise by cochlear implant users.
J Acoust Soc Am. 2017 Feb;141(2):1027. doi: 10.1121/1.4976059.
9
The Hearing-Aid Audio Quality Index (HAAQI).
IEEE/ACM Trans Audio Speech Lang Process. 2016 Feb;24(2):354-365. doi: 10.1109/TASLP.2015.2507858. Epub 2015 Dec 10.
10
The role of spectral cues in timbre discrimination by ferrets and humans.
J Acoust Soc Am. 2015 May;137(5):2870-83. doi: 10.1121/1.4916690.