Oxford University Phonetics Laboratory, OX1 2JF Oxford, United Kingdom.
J Acoust Soc Am. 2011 May;129(5):3258-70. doi: 10.1121/1.3559709.
Patterns of durational variation were examined by applying 15 previously published rhythm measures to a large corpus of speech from five languages. In order to achieve consistent segmentation across all languages, an automatic speech recognition system was developed to divide the waveforms into consonantal and vocalic regions. The resulting duration measurements rest strictly on acoustic criteria. Machine classification showed that rhythm measures could separate languages at rates above chance. Within-language variability in rhythm measures, however, was large and comparable to that between languages. Therefore, different languages could not be identified reliably from single paragraphs. In experiments separating pairs of languages, a rhythm measure that was relatively successful at separating one pair often performed very poorly on another pair: there was no broadly successful rhythm measure. Separation of all five languages at once required a combination of three rhythm measures. Many triplets were about equally effective, but the confusion patterns between languages varied with the choice of rhythm measures.
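The abstract does not list the 15 rhythm measures. As a rough illustration of how such measures are derived from a consonantal/vocalic segmentation, the sketch below computes a handful of measures that are widely used in the rhythm literature (%V, ΔV, ΔC, VarcoV, nPVI, rPVI). The choice of these particular measures, the function name `rhythm_measures`, the input format, and the use of the population standard deviation are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import mean, pstdev


def rhythm_measures(intervals):
    """Compute a few common rhythm measures (illustrative selection only).

    intervals: list of (label, duration_in_seconds) pairs in temporal order,
    where label is "C" (consonantal) or "V" (vocalic). Hypothetical format.
    Requires at least two intervals of each type.
    """
    v = [d for label, d in intervals if label == "V"]
    c = [d for label, d in intervals if label == "C"]

    def rpvi(durs):
        # Raw pairwise variability index: mean absolute difference
        # between successive intervals of the same type.
        return mean(abs(a - b) for a, b in zip(durs, durs[1:]))

    def npvi(durs):
        # Normalised PVI: each difference is divided by the mean
        # duration of the pair, then scaled by 100.
        return 100 * mean(
            abs(a - b) / ((a + b) / 2) for a, b in zip(durs, durs[1:])
        )

    return {
        "percent_V": 100 * sum(v) / (sum(v) + sum(c)),  # share of vocalic time
        "delta_V": pstdev(v),                           # SD of vocalic durations
        "delta_C": pstdev(c),                           # SD of consonantal durations
        "varco_V": 100 * pstdev(v) / mean(v),           # rate-normalised delta_V
        "nPVI_V": npvi(v),
        "rPVI_C": rpvi(c),
    }


# Toy input (made-up durations, not data from the study).
example = [("C", 0.1), ("V", 0.1), ("C", 0.3), ("V", 0.3)]
measures = rhythm_measures(example)
```

Each paragraph of speech would yield one such vector of measures, and a classifier would then be trained on those vectors. The finding that no single measure separates every language pair corresponds to no single key of this dictionary being sufficient on its own.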