Hadjipantelis P Z, Aston J A D, Müller H G, Evans J P
J Am Stat Assoc. 2015 Apr 3;110(510):545-559. doi: 10.1080/01621459.2015.1006729. Epub 2015 Jul 6.
Mandarin Chinese is characterized by being a tonal language; the pitch (or F0) of its utterances carries considerable linguistic information. However, speech samples from different individuals are subject to changes in amplitude and phase, which must be accounted for in any analysis that attempts to provide a linguistically meaningful description of the language. A joint model for amplitude, phase, and duration is presented, which combines elements from functional data analysis, compositional data analysis, and linear mixed effects models. By decomposing functions via a functional principal component analysis, and connecting registration functions to compositional data analysis, a joint multivariate mixed effect model can be formulated, which gives insights into the relationship between the different modes of variation as well as their dependence on linguistic and nonlinguistic covariates. The model is applied to the COSPRO-1 dataset, a comprehensive database of spoken Taiwanese Mandarin, containing approximately 50,000 phonetically diverse sample contours (syllables), and reveals that phonetic information is jointly carried by both amplitude and phase variation. Supplementary materials for this article are available online.
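The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes curves sampled on a common equispaced grid, computes a grid-based functional principal component analysis through a singular value decomposition, and treats the increments of a registration (warping) function as a composition that is mapped to unconstrained coordinates with a centered log-ratio transform. The function names `fpca`, `warp_to_clr`, and `clr_to_warp` are hypothetical, and the centered log-ratio is an illustrative choice of compositional transform that may differ from the one used in the article.

```python
import numpy as np


def fpca(curves, n_components):
    """Grid-based FPCA sketch (assumes a common equispaced grid).

    curves: array of shape (n_samples, n_grid).
    Returns the mean curve, the leading components (n_components, n_grid),
    and the per-curve scores (n_samples, n_components).
    """
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Rows of vt are orthonormal directions of variation on the grid.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return mean, components, scores


def warp_to_clr(warp):
    """Map a strictly increasing warping function on a grid, with
    warp[0] = 0 and warp[-1] = 1, to centered log-ratio coordinates."""
    increments = np.diff(warp)  # positive parts that sum to 1
    log_inc = np.log(increments)
    return log_inc - log_inc.mean()


def clr_to_warp(clr):
    """Invert warp_to_clr: recover the warping function from its
    centered log-ratio coordinates."""
    parts = np.exp(clr)
    parts = parts / parts.sum()
    return np.concatenate(([0.0], np.cumsum(parts)))


# Synthetic illustration only; these are not COSPRO-1 data.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
example_curves = rng.normal(size=(20, grid.size))
mean_curve, components, amplitude_scores = fpca(example_curves, 3)

example_warp = grid ** 2  # a simple monotone warping of [0, 1]
phase_coords = warp_to_clr(example_warp)
recovered_warp = clr_to_warp(phase_coords)
```

In a joint model of the kind the abstract describes, the amplitude scores, a low-dimensional summary of the phase coordinates, and the syllable duration would together form the multivariate response of a linear mixed effects model. That modeling step is not shown here.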