Detection of synchronization of the voice source and vocal tract in connected speech.

Authors

Story Brad H, Maxfield Lynn, Palaparthi Anil, Ferguson Sarah Hargus, Titze Ingo

Affiliations

Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85721, USA.

Utah Center for Vocology, University of Utah, Salt Lake City, Utah 84112, USA.

Publication

J Acoust Soc Am. 2025 Sep 1;158(3):2207-2224. doi: 10.1121/10.0039348.

DOI:10.1121/10.0039348

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12494142/

Abstract

The purpose of this study was to investigate the degree to which the coupling between the oscillating sound source and the vocal tract filter occurs in connected speech samples, and to provide insight into how humans may choose to deploy this coupling for intelligibility, intensity, or both. A technique was developed to extract, from minutes-long speech samples, the time-dependent fundamental frequency (fo) and the first two formant frequencies (F1 and F2) to permit an analysis that determines whether a talker aligns a voice source harmonic with a vocal tract resonance, and also measures a normalized vowel space area. The accuracy of the processing method was validated by applying it to a set of audio samples generated via speech simulation that provided "ground-truth" data. It was then applied to a 41-talker database of clear and conversational speech. Results indicated that talkers make adjustments for different speaking styles that include not only increased vowel space area but also alignment of harmonics and formant frequencies, although future work is needed to determine whether these adjustments are directed toward maximizing transfer of information or transfer of acoustic power.
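The analysis outlined in the abstract can be illustrated with a rough sketch. It is not the authors' processing method, which the abstract does not detail. The function names, the 0.5-semitone tolerance, the convention that unvoiced frames carry a value of 0, and the use of an unnormalized convex-hull area in the F1-F2 plane are all assumptions made for illustration. The normalization step mentioned in the abstract is left out because it is not specified there.

```python
import math

def nearest_harmonic_offset(fo_hz, formant_hz):
    """Return (n, offset) where n*fo is the harmonic closest in Hz to the
    formant and offset is the formant's distance from it in semitones."""
    n = max(1, round(formant_hz / fo_hz))
    offset = 12.0 * math.log2(formant_hz / (n * fo_hz))
    return n, offset

def alignment_fraction(fo_track, formant_track, tol_semitones=0.5):
    """Fraction of voiced frames whose formant lies within tol_semitones
    of a harmonic. The tolerance is a hypothetical choice."""
    hits = 0
    total = 0
    for fo, fmt in zip(fo_track, formant_track):
        if fo <= 0 or fmt <= 0:  # assumed marker for unvoiced or missing frames
            continue
        total += 1
        _, offset = nearest_harmonic_offset(fo, fmt)
        if abs(offset) <= tol_semitones:
            hits += 1
    return hits / total if total else 0.0

def convex_hull_area(points):
    """Area of the convex hull of (F1, F2) points, via Andrew's monotone
    chain and the shoelace formula. No normalization is applied."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull[:-1]  # drop the endpoint shared with the other half

    hull = half_hull(pts) + half_hull(reversed(pts))
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1])
    )
    return abs(twice_area) / 2.0
```

Given per-frame tracks of fo, F1, and F2 from any pitch and formant tracker, `alignment_fraction(fo, f1)` and `alignment_fraction(fo, f2)` would give one crude alignment measure per speaking style, and `convex_hull_area(list(zip(f1, f2)))` a raw vowel space area to compare alongside it.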

Similar articles

1

Detection of synchronization of the voice source and vocal tract in connected speech.

J Acoust Soc Am. 2025 Sep 1;158(3):2207-2224. doi: 10.1121/10.0039348.

2

Voice disorder discrimination using vowel acoustic measures in female speakers.

Int J Lang Commun Disord. 2024 Sep-Oct;59(5):2087-2102. doi: 10.1111/1460-6984.13081. Epub 2024 Jun 17.

3

An Observational Study of Discourse Tasks and Running Speech Sampling in the Assessment of Paediatric Voice Quality.

Int J Lang Commun Disord. 2025 Nov-Dec;60(6):e70132. doi: 10.1111/1460-6984.70132.

4

Vocal tract contribution to vocal intensity: Interaction between vocal fold adduction, formant tuning, and fundamental frequency.

J Acoust Soc Am. 2025 Sep 1;158(3):1904-1913. doi: 10.1121/10.0039239.

5

Differences of Electroglottographical Contact Quotients between Connected Speech and Sustained Phonation in Clinical Measurement of Voice.

J Voice. 2023 Mar 18. doi: 10.1016/j.jvoice.2023.02.020.

6

Mid Forehead Brow Lift

7

Smartphone Recordings are Comparable to "Gold Standard" Recordings for Acoustic Measurements of Voice.

J Voice. 2023 Apr 3. doi: 10.1016/j.jvoice.2023.01.031.

8

Auditory-Perceptual Evaluation of Situationally-Bound Judgements of Listener Comfort for Postlaryngectomy Voice and Speech.

Int J Lang Commun Disord. 2025 Sep-Oct;60(5):e70114. doi: 10.1111/1460-6984.70114.

9

Effectiveness of voice rehabilitation on vocalisation in postlaryngectomy patients: a systematic review.

Int J Evid Based Healthc. 2010 Dec;8(4):256-8. doi: 10.1111/j.1744-1609.2010.00177.x.

10

Vesicoureteral Reflux

References cited in this article

1

Computer simulation of vocal tract resonance tuning strategies with respect to fundamental frequency and voice source spectral slope in singing.

J Acoust Soc Am. 2022 Dec;152(6):3548. doi: 10.1121/10.0014421.

2

Formants are easy to measure; resonances, not so much: Lessons from Klatt (1986).

J Acoust Soc Am. 2022 Aug;152(2):933. doi: 10.1121/10.0013410.

3

Are source-filter interactions detectable in classical singing during vowel glides?

J Acoust Soc Am. 2021 Jun;149(6):4565. doi: 10.1121/10.0005432.

4

A model of speech production based on the acoustic relativity of the vocal tract.

J Acoust Soc Am. 2019 Oct;146(4):2522. doi: 10.1121/1.5127756.

5

F0-induced formant measurement errors result in biased variabilities.

J Acoust Soc Am. 2019 May;145(5):EL360. doi: 10.1121/1.5103195.

6

Intelligibility of Long-Distance Emergency Calling.

J Voice. 2020 Jan;34(1):44-52. doi: 10.1016/j.jvoice.2018.08.008. Epub 2018 Sep 15.

7

An age-dependent vocal tract model for males and females based on anatomic measurements.

J Acoust Soc Am. 2018 May;143(5):3079. doi: 10.1121/1.5038264.

8

Talker Differences in Clear and Conversational Speech: Perceived Sentence Clarity for Young Adults With Normal Hearing and Older Adults With Hearing Loss.

J Speech Lang Hear Res. 2018 Jan 22;61(1):159-173. doi: 10.1044/2017_JSLHR-H-17-0082.

9

Vowel space density as an indicator of speech performance.

J Acoust Soc Am. 2017 May;141(5):EL458. doi: 10.1121/1.4983342.

10

The role of vocal tract and subglottal resonances in producing vocal instabilities.

J Acoust Soc Am. 2017 Mar;141(3):1546. doi: 10.1121/1.4976954.
