A practical method of estimating the time-varying degree of vowel nasalization from acoustic features.

Affiliation

Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom.

Publication information

J Acoust Soc Am. 2021 Feb;149(2):911. doi: 10.1121/10.0002925.

Abstract

This paper presents a simple and easy-to-use method of creating a time-varying signal of the degree of nasalization in vowels, generated from acoustic features measured in oral and nasalized vowel contexts. The method is presented for separate models constructed using two sets of acoustic features: (1) an uninformed set of 13 Mel-frequency cepstral coefficients (MFCCs) and (2) a combination of the 13 MFCCs and a phonetically informed set of 20 acoustic features of vowel nasality derived from previous research. Both models are compared against two traditional approaches to estimating vowel nasalization from acoustics: A1-P0 and A1-P1, as well as their formant-compensated counterparts. Data include productions from six speakers of different language backgrounds, producing 11 different qualities within the vowel quadrilateral. The results generated from each of the methods are compared against nasometric measurements, representing an objective "ground truth" of the degree of nasalization. The results suggest that the proposed method is more robust than conventional acoustic approaches, generating signals which correlate strongly with nasometric measures across all vowel qualities and all speakers and accurately approximate the time-varying change in the degree of nasalization. Finally, an experimental example is provided to help researchers implement the method in their own study designs.
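The general idea of the abstract (learn from frames measured in oral and nasalized contexts, then score every frame of a vowel to get a time-varying signal) can be illustrated with a minimal sketch. The abstract does not specify the model, so everything below is an assumption made for illustration: the use of logistic regression, the two-dimensional toy feature vectors standing in for the 13 MFCCs, the synthetic frame values, and the invented `reference` values standing in for nasometric measurements. It should not be read as the authors' implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(frames, labels, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent.

    ASSUMPTION: logistic regression is a stand-in model; the abstract
    does not say which model the method uses.
    """
    n_features = len(frames[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(frames)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(frames, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def nasalization_track(frames, weights, bias):
    """Score each frame in [0, 1]: 0 is oral-like, 1 is nasalized-like."""
    return [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            for x in frames]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic training frames (hypothetical 2-D features, not real MFCCs).
oral_frames = [[-1.0, -0.4], [-0.8, -0.6], [-1.2, -0.5], [-0.9, -0.3]]
nasal_frames = [[1.0, 0.5], [0.8, 0.4], [1.1, 0.7], [0.9, 0.6]]
train_x = oral_frames + nasal_frames
train_y = [0] * len(oral_frames) + [1] * len(nasal_frames)

weights, bias = fit_logistic(train_x, train_y)

# A toy vowel whose frames drift from oral-like to nasalized-like over time.
steps = [0.0, 0.25, 0.5, 0.75, 1.0]
vowel_frames = [[-1.0 + 2.0 * t, -0.5 + 1.0 * t] for t in steps]
track = nasalization_track(vowel_frames, weights, bias)

# Invented reference values standing in for nasometric measurements.
reference = [0.1, 0.3, 0.5, 0.7, 0.9]
r = pearson(track, reference)

print("estimated track:", [round(v, 3) for v in track])
print("correlation with reference:", round(r, 3))
```

In a real study the toy vectors would be replaced by per-frame acoustic features extracted from recordings, and the reference by time-aligned nasometric measurements; the correlation step mirrors how the abstract describes comparing each method against that ground truth.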
