Automatic alignment for New Englishes: Applying state-of-the-art aligners to Trinidadian English.

Author information

Meer Philipp

Affiliation

Chair of English Linguistics, English Department, University of Münster, Johannisstraße 12-20, 48143, Münster, Germany.

Publication information

J Acoust Soc Am. 2020 Apr;147(4):2283. doi: 10.1121/10.0001069.

Abstract

While forced alignment has become an essential part of data processing in phonetic research, state-of-the-art aligners are often tailor-made exclusively for majority dialects, such as American English(es). This paper provides the first in-depth investigation into the reliability of popular pre-trained aligners in New Englishes, the nativized, postcolonial Englishes spoken worldwide. Using manually aligned data from Trinidadian English, the paper examines how well popular aligners [Forced Alignment and Vowel Extraction (FAVE), Munich Automatic Segmentation (MAUS), and the Montreal Forced Aligner (MFA)] automatically segment Trinidadian speech. Results show that, first, only specific aligners (FAVE and MFA) provide alignment comparable to that in their training varieties and, to a smaller degree, to general human inter-rater uncertainty. Second, even well-performing aligners introduce bias toward their training varieties: they systematically produce more erroneous alignments of Trinidadian English-specific vowels, for which they have no acoustic models. The findings suggest that phonetic research on New Englishes can benefit from pre-trained, state-of-the-art aligners, but that further manual data processing may generally be required to minimize errors in the analysis of non-majority dialect data.
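
The comparison of automatic and manual segmentation described in the abstract can be illustrated with a minimal sketch. It assumes that agreement is measured as the absolute difference between manual and automatic segment onsets, summarized as a mean and as the share of boundaries within a tolerance. The abstract does not state the paper's actual metric, so the 20 ms tolerance, the function names, and the sample values below are all hypothetical.

```python
from statistics import mean


def boundary_displacements(manual, automatic):
    """Absolute onset differences (seconds) for segments paired by position."""
    if len(manual) != len(automatic):
        raise ValueError("both tiers must contain the same number of segments")
    return [abs(m_on - a_on) for (_, m_on), (_, a_on) in zip(manual, automatic)]


def agreement_within(displacements, tolerance=0.020):
    """Share of boundaries displaced by at most `tolerance` seconds.

    The 20 ms default is an assumed threshold, not taken from the paper.
    """
    return sum(d <= tolerance for d in displacements) / len(displacements)


# Hypothetical (label, onset in seconds) pairs; not data from the paper.
manual = [("k", 0.100), ("ae", 0.160), ("t", 0.300)]
automatic = [("k", 0.105), ("ae", 0.190), ("t", 0.310)]

disp = boundary_displacements(manual, automatic)
print(f"mean displacement: {mean(disp) * 1000:.1f} ms")
print(f"share within 20 ms: {agreement_within(disp):.0%}")
```

Grouping the displacements by vowel label before summarizing would be one way to look for the variety-specific bias the abstract reports, again under the same assumptions.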
