Formant measurement in children's speech based on spectral filtering.

Authors

Story Brad H, Bunton Kate

Affiliation

Speech Acoustics Laboratory, Department of Speech, Language, and Hearing Sciences, University of Arizona, P.O. Box 210071, Tucson, AZ 85721.

Publication information

Speech Commun. 2015;76:93-111. doi: 10.1016/j.specom.2015.11.001.

DOI:10.1016/j.specom.2015.11.001

PMID:26855461

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4743040/

Abstract

Children's speech presents a challenging problem for formant frequency measurement. In part, this is because high fundamental frequencies, typical of children's speech production, generate widely spaced harmonic components that may undersample the spectral shape of the vocal tract transfer function. In addition, there is often a weakening of upper harmonic energy and a noise component due to glottal turbulence. The purpose of this study was to develop a formant measurement technique based on cepstral analysis that does not require modification of the cepstrum itself or transformation back to the spectral domain. Instead, a narrow-band spectrum is low-pass filtered with a cutoff point (i.e., cutoff "quefrency" in the terminology of cepstral analysis) to preserve only the spectral envelope. To test the method, speech representative of a 2- to 3-year-old child was simulated with an airway modulation model of speech production. The model, which includes physiologically scaled vocal folds and vocal tract, generates sound output analogous to a microphone signal. The vocal tract resonance frequencies can be calculated independently of the output signal and thus provide test cases for assessing the accuracy of the formant tracking algorithm. When applied to the simulated child-like speech, the spectral filtering approach was shown to provide a clear spectrographic representation of formant change over the time course of the signal and to facilitate tracking formant frequencies for further analysis.
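The core idea in the abstract, low-pass filtering a narrow-band spectrum along its frequency axis so that only the envelope remains, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `lowpass_spectrum`, the single-pole smoother run forward and backward (chosen only because it is short and zero-phase), the 10 Hz bin spacing, the 1 ms cutoff quefrency, and the synthetic log spectrum with a 400 Hz harmonic ripple.

```python
import math

def lowpass_spectrum(log_spec_db, bin_hz, cutoff_quefrency_s):
    """Zero-phase low-pass filter applied along the frequency axis.

    Hypothetical helper: a single-pole smoother, not the paper's filter.
    """
    # The spectrum is a sequence sampled every bin_hz, so a ripple with
    # quefrency q seconds completes q * bin_hz cycles per bin.
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_quefrency_s * bin_hz)

    def one_pass(x):
        y, state = [], x[0]  # start at the first value to limit edge transients
        for v in x:
            state = a * v + (1.0 - a) * state
            y.append(state)
        return y

    forward = one_pass(log_spec_db)
    # Filtering the reversed sequence cancels the phase lag of the first pass.
    return one_pass(forward[::-1])[::-1]

# Synthetic narrow-band log spectrum (assumed values, in dB):
# a smooth two-peak envelope plus harmonic ripple spaced at f0 = 400 Hz.
bin_hz = 10.0
f0 = 400.0
freqs = [k * bin_hz for k in range(500)]  # 0 to 4990 Hz
envelope = [20.0 * math.exp(-((f - 1000.0) / 300.0) ** 2)
            + 15.0 * math.exp(-((f - 3000.0) / 400.0) ** 2) for f in freqs]
ripple = [10.0 * math.cos(2.0 * math.pi * f / f0) for f in freqs]
spectrum = [e + r for e, r in zip(envelope, ripple)]

# The ripple sits at quefrency 1/f0 = 2.5 ms, so a 1 ms cutoff attenuates it
# while passing the slowly varying envelope.
smoothed = lowpass_spectrum(spectrum, bin_hz, cutoff_quefrency_s=0.001)
```

Peaks of `smoothed` could then be picked as formant candidates. A single-pole filter rolls off gently, so some harmonic ripple survives; a higher-order low-pass filter would separate envelope and harmonics more sharply, at the cost of a longer implementation.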


Similar articles

Human Frequency Following Responses to Filtered Speech.

Ear Hear. 2021 Jan/Feb;42(1):87-105. doi: 10.1097/AUD.0000000000000902.

Human Frequency Following Responses to Vocoded Speech.

Ear Hear. 2017 Sep/Oct;38(5):e256-e267. doi: 10.1097/AUD.0000000000000432.

Spectrographic and Electroglottographic Findings of Religious Vocal Performers in Düzce Province of Turkey.

J Voice. 2018 Jan;32(1):127.e25-127.e35. doi: 10.1016/j.jvoice.2017.03.007. Epub 2017 May 11.

Cepstral representation of speech motivated by time-frequency masking: an application to speech recognition.

J Acoust Soc Am. 1996 Jul;100(1):603-14. doi: 10.1121/1.415961.

Acoustics of children's speech: developmental changes of temporal and spectral parameters.

J Acoust Soc Am. 1999 Mar;105(3):1455-68. doi: 10.1121/1.426686.

Noise estimation in voice signals using short-term cepstral analysis.

J Acoust Soc Am. 2007 Mar;121(3):1679-90. doi: 10.1121/1.2427123.

Formant frequency analysis of children's spoken and sung vowels using sweeping fundamental frequency production.

J Voice. 1999 Dec;13(4):570-82. doi: 10.1016/s0892-1997(99)80011-3.

A statistical, formant-pattern model for segregating vowel type and vocal-tract length in developmental formant data.

J Acoust Soc Am. 2009 Apr;125(4):2374-86. doi: 10.1121/1.3079772.

Audio-vocal responses of vocal fundamental frequency and formant during sustained vowel vocalizations in different noises.

Hear Res. 2015 Jun;324:1-6. doi: 10.1016/j.heares.2015.02.005. Epub 2015 Mar 5.

Cited by

Conducting high-quality and reliable acoustic analysis: A tutorial focused on training research assistants.

J Acoust Soc Am. 2024 Apr 1;155(4):2603-2611. doi: 10.1121/10.0025536.

Relating Acoustic Measures to Listener Ratings of Children's Productions of Word-Initial /ɹ/ and /w/.

J Speech Lang Hear Res. 2023 Sep 13;66(9):3413-3427. doi: 10.1044/2023_JSLHR-22-00713. Epub 2023 Aug 17.

Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).

J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.

Vocal plasticity in harbour seal pups.

Philos Trans R Soc Lond B Biol Sci. 2021 Dec 20;376(1840):20200456. doi: 10.1098/rstb.2020.0456. Epub 2021 Nov 1.

Dark tone quality and vocal tract shaping in soprano song production: Insights from real-time MRI.

JASA Express Lett. 2021 Jul;1(7):075202. doi: 10.1121/10.0005109. Epub 2021 Jul 9.

Does Early Phonetic Differentiation Predict Later Phonetic Development? Evidence From a Longitudinal Study of /ɹ/ Development in Preschool Children.

J Speech Lang Hear Res. 2021 Jul 16;64(7):2417-2437. doi: 10.1044/2021_JSLHR-20-00555. Epub 2021 May 31.

What Acoustic Studies Tell Us About Vowels in Developing and Disordered Speech.

Am J Speech Lang Pathol. 2020 Aug 4;29(3):1749-1778. doi: 10.1044/2020_AJSLP-19-00178. Epub 2020 Jul 6.

Computing low-dimensional representations of speech from socio-auditory structures for phonetic analyses.

J Phon. 2018 Nov;71:355-375. Epub 2018 Oct 24.

Static measurements of vowel formant frequencies and bandwidths: A review.

J Commun Disord. 2018 Jul-Aug;74:74-97. doi: 10.1016/j.jcomdis.2018.05.004. Epub 2018 Jun 1.
