Identification of resynthesized /hVd/ utterances: effects of formant contour.

Author information

Hillenbrand J M, Nearey T M

Affiliation

Department of Speech Pathology and Audiology, Western Michigan University, Kalamazoo 49008, USA.

Publication information

J Acoust Soc Am. 1999 Jun;105(6):3509-23. doi: 10.1121/1.424676.

DOI:10.1121/1.424676

PMID:10380673

Abstract

The purpose of this study was to examine the role of formant frequency movements in vowel recognition. Measurements of vowel duration, fundamental frequency, and formant contours were taken from a database of acoustic measurements of 1668 /hVd/ utterances spoken by 45 men, 48 women, and 46 children [Hillenbrand et al., J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. A 300-utterance subset was selected from this database, representing equal numbers of 12 vowels and approximately equal numbers of tokens produced by men, women, and children. Listeners were asked to identify the original, naturally produced signals and two formant-synthesized versions. One set of "original formant" (OF) synthetic signals was generated using the measured formant contours, and a second set of "flat formant" (FF) signals was synthesized with formant frequencies fixed at the values measured at the steadiest portion of the vowel. Results included: (a) the OF synthetic signals were identified with substantially greater accuracy than the FF signals; and (b) the naturally produced signals were identified with greater accuracy than the OF synthetic signals. Pattern recognition results showed that a simple approach to vowel specification based on duration, steady-state F0, and formant frequency measurements at 20% and 80% of vowel duration accounts for much but by no means all of the variation in listeners' labeling of the three types of stimuli.
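
The contrast between the OF and FF conditions, and the 20%/80% feature set used in the pattern recognition analysis, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the linear interpolation between frames, the frame-indexed contour representation, and the choice to pass the steady-state frame index as an input are all assumptions made for illustration; the abstract does not say how the steadiest portion was located or which formants were used.

```python
from typing import List, Sequence


def sample_at_fraction(contour: Sequence[float], fraction: float) -> float:
    """Linearly interpolate a contour at a fractional position in [0, 1].

    Assumption: the contour is a list of equally spaced frame values (Hz)
    spanning the whole vowel, so fraction 0.2 means 20% of vowel duration.
    """
    if not contour:
        raise ValueError("contour must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    pos = fraction * (len(contour) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(contour) - 1)
    weight = pos - lo
    return contour[lo] * (1.0 - weight) + contour[hi] * weight


def flatten_contour(contour: Sequence[float], steady_index: int) -> List[float]:
    """Build a 'flat formant' contour held at the value of one frame.

    Assumption: steady_index is supplied by the caller; how the steadiest
    portion of the vowel is found is outside the scope of this sketch.
    """
    return [contour[steady_index]] * len(contour)


def feature_vector(
    duration_ms: float,
    f0_hz: float,
    formants: Sequence[Sequence[float]],
) -> List[float]:
    """Collect duration, steady-state F0, and formants at 20% and 80%.

    Output order: duration, F0, every formant at 20%, every formant at 80%.
    """
    features = [duration_ms, f0_hz]
    for fraction in (0.2, 0.8):
        for contour in formants:
            features.append(sample_at_fraction(contour, fraction))
    return features
```

In this sketch, a contour passed through `flatten_contour` yields identical 20% and 80% samples, so any classifier built on `feature_vector` loses the movement information that distinguishes the OF signals from the FF signals.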
