Haskins Laboratories, 300 George Street, New Haven, Connecticut 06511, USA.
J Acoust Soc Am. 2013 Sep;134(3):2235-46. doi: 10.1121/1.4816491.
While efforts to document endangered languages have steadily increased, the phonetic analysis of endangered language data remains a challenge. Transcribing a large documentation corpus is, by itself, a tremendous feat, yet segmentation remains a bottleneck for research with data of this kind. This paper examines whether a speech processing tool, forced alignment, can facilitate the segmentation task for small data sets, even when the target language differs from the training language. The authors also examined whether a contextualized phone set, one that includes allophonic variants, outperforms a more general one. The accuracy of two forced aligners trained on English (hmalign and p2fa) was assessed using corpus data from Yoloxóchitl Mixtec. Overall, agreement was relatively good, with accuracy at 70.9% within 30 ms for hmalign and 65.7% within 30 ms for p2fa. Segmental and tonal categories influenced accuracy as well. For instance, additional stop allophones in hmalign's phone set aided alignment accuracy. Agreement differences between aligners also corresponded closely with the types of data on which the aligners were trained. Overall, existing alignment systems were found to have potential for making phonetic analysis of small corpora more efficient, with more allophonically detailed phone sets providing better agreement than general ones.
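The "accuracy within 30 ms" figures reported above can be read as the share of segment boundaries that an aligner places within a fixed tolerance of a reference boundary. The following is a minimal sketch of that kind of measure, not the authors' scoring procedure, which the abstract does not describe. The function name `agreement_within`, the use of milliseconds, and the assumption that reference and aligned boundaries are already paired one-to-one are all illustrative choices.

```python
from typing import Sequence


def agreement_within(
    reference_ms: Sequence[float],
    aligned_ms: Sequence[float],
    tolerance_ms: float = 30.0,
) -> float:
    """Return the fraction of boundaries placed within a tolerance of the reference.

    Hypothetical helper: assumes the two sequences list the same boundaries
    in the same order (e.g. reference labels vs. forced-aligner output).
    """
    if len(reference_ms) != len(aligned_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not reference_ms:
        raise ValueError("at least one boundary is required")

    # Count boundaries whose absolute time difference is within the tolerance.
    hits = sum(
        1
        for ref, hyp in zip(reference_ms, aligned_ms)
        if abs(ref - hyp) <= tolerance_ms
    )
    return hits / len(reference_ms)


if __name__ == "__main__":
    # Made-up boundary times in milliseconds, for illustration only.
    reference = [100, 250, 400, 620]
    aligned = [110, 290, 395, 650]
    print(agreement_within(reference, aligned))  # differences: 10, 40, 5, 30 ms
```

Under this reading, a score of 0.709 would mean that about 71 of every 100 boundaries fell within 30 ms of the reference. Tightening `tolerance_ms` gives a stricter measure on the same data.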