Knowles Thea, Clayards Meghan, Sonderegger Morgan
School of Communication Sciences & Disorders, Western University, London, Ontario, Canada.
Health & Rehabilitation Sciences, Western University, London, Ontario, Canada.
J Speech Lang Hear Res. 2018 Oct 26;61(10):2487-2501. doi: 10.1044/2018_JSLHR-S-17-0275.
Heterogeneous child speech was force-aligned to investigate whether (a) manipulating specific parameters could improve alignment accuracy and (b) forced alignment could be used to replicate published results on acoustic characteristics of /s/ production by children.
In Part 1, child speech from 2 corpora was force-aligned with a trainable aligner (Prosodylab-Aligner) under different conditions that systematically manipulated the input training data and the type of transcription used. Alignment accuracy was determined by comparing hand and automatic alignments in terms of how often they overlapped (%-Match) and the absolute differences in segment duration and boundary placement. Using mixed-effects regression, accuracy was modeled as a function of alignment condition, as well as segment and child age. In Part 2, forced alignments derived from a subset of the alignment conditions in Part 1 were used to extract the spectral center of gravity of /s/ productions from young children. These findings were compared to published results that used manual alignments of the same data.
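The accuracy measures described above can be illustrated with a minimal sketch. It assumes that hand and automatic segments are already paired one to one, and that a "match" is counted when the midpoint of the automatic segment falls inside the hand-aligned interval; that criterion is an assumption made for illustration and may differ from the definition of %-Match used in the study. The names `Segment` and `compare_alignments` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str
    start: float  # onset in seconds
    end: float    # offset in seconds


def compare_alignments(hand, auto):
    """Compare paired hand and automatic segments.

    ASSUMPTION: a segment counts as a match when the midpoint of the
    automatic segment lies within the hand-aligned interval. This is an
    illustrative criterion, not necessarily the study's definition.
    """
    if len(hand) != len(auto) or not hand:
        raise ValueError("expected two non-empty, equally long segment lists")

    matches = 0
    duration_diffs, onset_diffs, offset_diffs = [], [], []
    for h, a in zip(hand, auto):
        midpoint = (a.start + a.end) / 2
        if h.start <= midpoint <= h.end:
            matches += 1
        # Absolute differences in duration and in each boundary placement.
        duration_diffs.append(abs((h.end - h.start) - (a.end - a.start)))
        onset_diffs.append(abs(h.start - a.start))
        offset_diffs.append(abs(h.end - a.end))

    n = len(hand)
    return {
        "pct_match": 100.0 * matches / n,
        "mean_abs_duration_diff": sum(duration_diffs) / n,
        "mean_abs_onset_diff": sum(onset_diffs) / n,
        "mean_abs_offset_diff": sum(offset_diffs) / n,
    }


# Made-up example values, for illustration only.
hand = [Segment("s", 0.10, 0.25), Segment("a", 0.25, 0.40)]
auto = [Segment("s", 0.12, 0.24), Segment("a", 0.45, 0.55)]
result = compare_alignments(hand, auto)
```

Per-segment values such as these, rather than the means alone, are what one would typically pass to a regression model with alignment condition, segment, and age as predictors.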
Overall, the results of Part 1 demonstrated that both training data more similar to the data being aligned and phonetic transcription improved alignment accuracy. Speech from older children was aligned more accurately than speech from younger children. In Part 2, /s/ center of gravity extracted from force-aligned segments was found to diverge in the speech of male and female children, replicating the pattern found in previous work using manually aligned segments. This was true even for the least accurate forced alignment method.
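For readers unfamiliar with the measure, spectral center of gravity is the power-weighted mean frequency of a segment's spectrum. The sketch below is one common formulation, not the extraction procedure used in the study: it assumes a Hann window, power (squared magnitude) weighting, and a plain discrete Fourier transform written with the standard library only. The function name `spectral_cog` is hypothetical; in practice an FFT routine or a phonetics tool would normally be used instead.

```python
import cmath
import math


def spectral_cog(samples, sample_rate):
    """Power-weighted mean frequency (Hz) of a mono signal.

    ASSUMPTIONS: Hann window and power weighting; the study's actual
    analysis settings are not given in the abstract.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    # Apply a Hann window to reduce spectral leakage at the segment edges.
    windowed = [
        s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]

    weighted_sum = 0.0
    total_power = 0.0
    # Naive DFT over the non-negative frequency bins (O(n^2), fine for short segments).
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        power = abs(coeff) ** 2
        weighted_sum += (k * sample_rate / n) * power
        total_power += power

    if total_power == 0.0:
        raise ValueError("signal has no energy")
    return weighted_sum / total_power


# Synthetic 20 ms, 4 kHz tone sampled at 16 kHz, for illustration only.
tone = [math.sin(2.0 * math.pi * 4000 * i / 16000) for i in range(320)]
cog = spectral_cog(tone, 16000)
```

Because the measure is computed over the whole segment, small errors in boundary placement shift it only slightly when the fricative noise is fairly steady, which is consistent with the finding that even the least accurate alignments reproduced the pattern.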
Alignment accuracy for child speech can be improved by using training data that more closely resemble the speech being aligned and by using phonetic transcription. However, poor alignment accuracy was not found to impede acoustic analysis of /s/ produced by even very young children. Thus, forced alignment is a useful tool for the analysis of child speech.