Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).

Affiliations

Haskins Laboratories, New Haven, Connecticut 06511, USA.

Department of Linguistics, California State University Fresno, Fresno, California 93740, USA.

Publication information

J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.

Abstract

Formants in speech signals are easily identified, largely because formants are defined to be local maxima in the wideband sound spectrum. Sadly, this is not what is of most interest in analyzing speech; instead, resonances of the vocal tract are of interest, and they are much harder to measure. Klatt [(1986). in Proceedings of the Montreal Satellite Symposium on Speech Recognition, 12th International Congress on Acoustics, edited by P. Mermelstein (Canadian Acoustical Society, Montreal), pp. 5-7] showed that estimates of resonances are biased by harmonics while the human ear is not. Several analysis techniques placed the formant closer to a strong harmonic than to the center of the resonance. This "harmonic attraction" can persist with newer algorithms and in hand measurements, and systematic errors can persist even in large corpora. Research has shown that the reassigned spectrogram is less subject to these errors than linear predictive coding and similar measures, but it has not been satisfactorily automated, making its wider use unrealistic. Pending better techniques, the recommendations are (1) acknowledge limitations of current analyses regarding influence of F0 and limits on granularity, (2) report settings more fully, (3) justify settings chosen, and (4) examine the pattern of F0 vs F1 for possible harmonic bias.
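Recommendation (4) can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the sample values, and the 0.25 reference level are assumptions made for illustration. The idea is to express each F1 estimate as a signed offset from the nearest harmonic of F0, in units of F0. If F1 estimates were unrelated to harmonic positions, the offsets would be spread roughly evenly between -0.5 and 0.5, giving a mean absolute offset near 0.25; values well below that would be consistent with harmonic attraction.

```python
def harmonic_offset(f0_hz, f1_hz):
    """Signed distance from F1 to the nearest harmonic of F0, in units of F0.

    The result lies in [-0.5, 0.5]; 0 means F1 coincides with a harmonic.
    """
    ratio = f1_hz / f0_hz
    return ratio - round(ratio)


def mean_abs_offset(pairs):
    """Mean absolute harmonic offset over (F0, F1) pairs given in Hz.

    Roughly 0.25 is expected when F1 is unrelated to harmonic positions
    (an assumption of this sketch, not a threshold from the paper).
    """
    offsets = [abs(harmonic_offset(f0, f1)) for f0, f1 in pairs]
    return sum(offsets) / len(offsets)


# Hypothetical measurements (F0, F1) in Hz, for illustration only.
measurements = [(200.0, 600.0), (200.0, 650.0), (220.0, 660.0), (180.0, 700.0)]
print(mean_abs_offset(measurements))
```

A low mean offset on real data would only flag a pattern worth inspecting; it does not by itself show that the resonance estimates are wrong, since true resonances can also fall near harmonics.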
