Assessing accuracy of resonances obtained with reassigned spectrograms from the "ground truth" of physical vocal tract models.

Authors

Shadle Christine H, Fulop Sean A, Chen Wei-Rong, Whalen D H

Affiliations

Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA.

Department of Linguistics, Fresno State University, Fresno, California 93740, USA.

Publication

J Acoust Soc Am. 2024 Feb 1;155(2):1253-1263. doi: 10.1121/10.0024548.

Abstract

The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). "Comparing measurement errors for formants in synthetic and natural vowels," J. Acoust. Soc. Am. 139(2), 713-727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
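
The harmonic bias mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' analysis and does not compute a reassigned spectrogram; it assumes a simplified peak-picking method that reports the harmonic of the source closest to the true resonance. The function name `nearest_harmonic` and the 500 Hz resonance are hypothetical values chosen for illustration.

```python
def nearest_harmonic(resonance_hz: float, f0_hz: float) -> float:
    """Return the harmonic of f0_hz that lies closest to resonance_hz.

    Assumption: a naive analysis can only "see" energy at harmonics of the
    source, so its resonance estimate snaps to the nearest one.
    """
    # Harmonic index, never below the first harmonic (the fundamental itself).
    k = max(1, round(resonance_hz / f0_hz))
    return k * f0_hz


def harmonic_bias(resonance_hz: float, f0_hz: float) -> float:
    """Signed error (Hz) of the snapped estimate relative to the true resonance."""
    return nearest_harmonic(resonance_hz, f0_hz) - resonance_hz


if __name__ == "__main__":
    true_resonance = 500.0  # hypothetical resonance, in Hz
    for f0 in (100.0, 200.0, 300.0):
        est = nearest_harmonic(true_resonance, f0)
        print(f"f0={f0:5.0f} Hz  estimate={est:6.0f} Hz  "
              f"bias={harmonic_bias(true_resonance, f0):+5.0f} Hz")
```

Under this assumption the error can reach half the fundamental frequency, so it grows as the fundamental rises. That is why a source with widely spaced harmonics is a demanding test for any resonance estimator.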

Similar articles

2
Comparing measurement errors for formants in synthetic and natural vowels.
J Acoust Soc Am. 2016 Feb;139(2):713-27. doi: 10.1121/1.4940665.
3
The Dynamic Effect of the Valleculae on Singing Voice - An Exploratory Study Using 3D Printed Vocal Tracts.
J Voice. 2023 Mar;37(2):178-186. doi: 10.1016/j.jvoice.2020.12.012. Epub 2021 Jan 1.
4
Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).
J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.
5
Vocal Tract and Subglottal Impedance in High Performance Singing: A Case Study.
J Voice. 2024 Sep;38(5):1248.e11-1248.e21. doi: 10.1016/j.jvoice.2022.01.015. Epub 2022 Feb 26.
7
Human vocal tract resonances and the corresponding mode shapes investigated by three-dimensional finite-element modelling based on CT measurement.
Logoped Phoniatr Vocol. 2015 Apr;40(1):14-23. doi: 10.3109/14015439.2013.775333. Epub 2013 Mar 21.
9
Processing group delay spectrograms for study of formant and harmonic contours in speech signals.
J Acoust Soc Am. 2024 Oct 1;156(4):2422-2433. doi: 10.1121/10.0032364.

Cited by

1
Formant analysis of vertebrate vocalizations: achievements, pitfalls, and promises.
BMC Biol. 2025 Apr 7;23(1):92. doi: 10.1186/s12915-025-02188-w.

References

1
Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).
J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.
2
Printable 3D vocal tract shapes from MRI data and their acoustic and aerodynamic properties.
Sci Data. 2020 Aug 5;7(1):255. doi: 10.1038/s41597-020-00597-w.
3
F0-induced formant measurement errors result in biased variabilities.
J Acoust Soc Am. 2019 May;145(5):EL360. doi: 10.1121/1.5103195.
4
How to precisely measure the volume velocity transfer function of physical vocal tract models by external excitation.
PLoS One. 2018 Mar 15;13(3):e0193708. doi: 10.1371/journal.pone.0193708. eCollection 2018.
5
An Acoustic Study of Vowels Produced by Alaryngeal Speakers in Taiwan.
Am J Speech Lang Pathol. 2016 Nov 1;25(4):481-492. doi: 10.1044/2016_AJSLP-15-0068.
6
Comparing measurement errors for formants in synthetic and natural vowels.
J Acoust Soc Am. 2016 Feb;139(2):713-27. doi: 10.1121/1.4940665.
7
A New Reassigned Spectrogram Method in Interference Detection for GNSS Receivers.
Sensors (Basel). 2015 Sep 2;15(9):22167-91. doi: 10.3390/s150922167.
10
Education in acoustics and speech science using vocal-tract models.
J Acoust Soc Am. 2012 Mar;131(3):2444-54. doi: 10.1121/1.3677245.
