Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).

Affiliations

Haskins Laboratories, New Haven, Connecticut 06511, USA.

Department of Linguistics, California State University Fresno, Fresno, California 93740, USA.

Publication information

J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.

DOI: 10.1121/10.0013410

PMID: 36050157

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9374483/

Abstract

Formants in speech signals are easily identified, largely because formants are defined to be local maxima in the wideband sound spectrum. Sadly, this is not what is of most interest in analyzing speech; instead, resonances of the vocal tract are of interest, and they are much harder to measure. Klatt [(1986). in Proceedings of the Montreal Satellite Symposium on Speech Recognition, 12th International Congress on Acoustics, edited by P. Mermelstein (Canadian Acoustical Society, Montreal), pp. 5-7] showed that estimates of resonances are biased by harmonics while the human ear is not. Several analysis techniques placed the formant closer to a strong harmonic than to the center of the resonance. This "harmonic attraction" can persist with newer algorithms and in hand measurements, and systematic errors can persist even in large corpora. Research has shown that the reassigned spectrogram is less subject to these errors than linear predictive coding and similar measures, but it has not been satisfactorily automated, making its wider use unrealistic. Pending better techniques, the recommendations are (1) acknowledge limitations of current analyses regarding influence of F0 and limits on granularity, (2) report settings more fully, (3) justify settings chosen, and (4) examine the pattern of F0 vs F1 for possible harmonic bias.
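Recommendation (4) can be illustrated with a minimal sketch. This is not the authors' procedure; it assumes paired F0 and F1 estimates in Hz are already available from some formant tracker, and the function names `harmonic_offset` and `harmonic_bias_index` are hypothetical. The idea is to express each F1 estimate as a distance from its nearest harmonic of F0, so that clustering of estimates on harmonics becomes visible.

```python
import numpy as np


def harmonic_offset(f0_hz, f1_hz):
    """Signed distance from each F1 estimate to the nearest harmonic of F0.

    The result is in units of F0 and lies between -0.5 and 0.5;
    0 means the F1 estimate sits exactly on a harmonic.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    f1 = np.asarray(f1_hz, dtype=float)
    ratio = f1 / f0                      # F1 expressed as a harmonic number
    return ratio - np.round(ratio)       # distance to the nearest integer harmonic


def harmonic_bias_index(f0_hz, f1_hz):
    """Mean absolute offset across tokens (hypothetical summary statistic)."""
    return float(np.mean(np.abs(harmonic_offset(f0_hz, f1_hz))))


if __name__ == "__main__":
    # Made-up illustrative values, not data from the paper.
    f0 = [100.0, 200.0, 150.0]
    f1 = [310.0, 620.0, 450.0]
    print(harmonic_offset(f0, f1))
    print(harmonic_bias_index(f0, f1))
```

If the offsets were spread uniformly between -0.5 and 0.5, the mean absolute offset would be 0.25; values well below that across a corpus would be consistent with F1 estimates being drawn toward harmonics. This is only a screening check: a low value does not by itself prove measurement bias, since resonances can also fall near harmonics for other reasons.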

Similar articles

Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).

J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.

Comparing measurement errors for formants in synthetic and natural vowels.

J Acoust Soc Am. 2016 Feb;139(2):713-27. doi: 10.1121/1.4940665.

Assessing accuracy of resonances obtained with reassigned spectrograms from the "ground truth" of physical vocal tract models.

J Acoust Soc Am. 2024 Feb 1;155(2):1253-1263. doi: 10.1121/10.0024548.

Accuracy of formant measurement for synthesized vowels using the reassigned spectrogram and comparison with linear prediction.

J Acoust Soc Am. 2010 Apr;127(4):2114-7. doi: 10.1121/1.3308476.

F0-induced formant measurement errors result in biased variabilities.

J Acoust Soc Am. 2019 May;145(5):EL360. doi: 10.1121/1.5103195.

New Evidence That Nonlinear Source-Filter Coupling Affects Harmonic Intensity and fo Stability During Instances of Harmonics Crossing Formants.

J Voice. 2017 Mar;31(2):149-156. doi: 10.1016/j.jvoice.2016.04.010. Epub 2016 Aug 5.

Informational masking and the effects of differences in fundamental frequency and fundamental-frequency contour on phonetic integration in a formant ensemble.

Hear Res. 2017 Feb;344:295-303. doi: 10.1016/j.heares.2016.10.026. Epub 2016 Nov 1.

The F1-F2 vowel chart for Czech whispered vowels a, e, i, o, u.

Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub. 2007 Dec;151(2):353-6. doi: 10.5507/bp.2007.061.

Frequency measurement of vowel formants produced by Brazilian children aged between 4 and 8 years.

J Voice. 2015 May;29(3):292-8. doi: 10.1016/j.jvoice.2014.08.001. Epub 2014 Dec 12.

Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization.

J Acoust Soc Am. 2015 May;137(5):3005-7. doi: 10.1121/1.4919349.

Cited by

Formant analysis of vertebrate vocalizations: achievements, pitfalls, and promises.

BMC Biol. 2025 Apr 7;23(1):92. doi: 10.1186/s12915-025-02188-w.

A cross-language speech model for detection of Parkinson's disease.

J Neural Transm (Vienna). 2025 Apr;132(4):579-590. doi: 10.1007/s00702-024-02874-z. Epub 2024 Dec 30.

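Assessing accuracy of resonances obtained with reassigned spectrograms from the "ground truth" of physical vocal tract models.

J Acoust Soc Am. 2024 Feb 1;155(2):1253-1263. doi: 10.1121/10.0024548.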

A practical guide to calculating vocal tract length and scale-invariant formant patterns.

Behav Res Methods. 2024 Sep;56(6):5588-5604. doi: 10.3758/s13428-023-02288-x. Epub 2023 Dec 29.

An acoustic study of Cantonese alaryngeal speech in different speaking conditions.

J Acoust Soc Am. 2023 May 1;153(5):2973. doi: 10.1121/10.0019471.

Voice efficiency for different voice qualities combining experimentally derived sound signals and numerical modeling of the vocal tract.

Front Physiol. 2022 Dec 23;13:1081622. doi: 10.3389/fphys.2022.1081622. eCollection 2022.

References

The spectrogram, method of reassignment, and frequency-domain beamforming.

J Acoust Soc Am. 2021 Feb;149(2):747. doi: 10.1121/10.0003384.

Vowel variability and contrast in Childhood Apraxia of Speech: acoustics and articulation.

Clin Linguist Phon. 2021 Nov 2;35(11):1011-1035. doi: 10.1080/02699206.2020.1853811. Epub 2020 Dec 16.

F0-induced formant measurement errors result in biased variabilities.

J Acoust Soc Am. 2019 May;145(5):EL360. doi: 10.1121/1.5103195.

Formant estimation and tracking: A deep learning approach.

J Acoust Soc Am. 2019 Feb;145(2):642. doi: 10.1121/1.5088048.

Vowel Formant Dispersion Reflects Severity of Apraxia of Speech.

Aphasiology. 2018;32(8):902-921. doi: 10.1080/02687038.2017.1385050. Epub 2017 Oct 2.

Quasi-closed phase forward-backward linear prediction analysis of speech for accurate formant detection and estimation.

J Acoust Soc Am. 2017 Sep;142(3):1542. doi: 10.1121/1.5001512.

ConceFT: concentration of frequency and time via a multitapered synchrosqueezed transform.

Philos Trans A Math Phys Eng Sci. 2016 Apr 13;374(2065):20150193. doi: 10.1098/rsta.2015.0193.

Comparing measurement errors for formants in synthetic and natural vowels.

J Acoust Soc Am. 2016 Feb;139(2):713-27. doi: 10.1121/1.4940665.

Formant measurement in children's speech based on spectral filtering.

Speech Commun. 2015;76:93-111. doi: 10.1016/j.specom.2015.11.001.

Hearing impairment and vowel production. A comparison between normally hearing, hearing-aided and cochlear implanted Dutch children.

J Commun Disord. 2016 Jan-Feb;59:24-39. doi: 10.1016/j.jcomdis.2015.10.007. Epub 2015 Nov 10.
