Haderlein Tino, Middag Catherine, Martens Jean-Pierre, Döllinger Michael, Nöth Elmar
Phoniatrische und pädaudiologische Abteilung, Universitätsklinikum Erlangen, Erlangen, Germany.
Folia Phoniatr Logop. 2014;66(6):219-26. doi: 10.1159/000365969. Epub 2015 Jan 31.
Automatic intelligibility assessment based on automatic speech recognition is usually language-specific. This study proposes a language-independent approach: models trained on Flemish speech are applied to assess chronically hoarse German speakers. The research questions are: (1) can acoustic features be constructed that generalize to another language and to a speech disorder, and (2) is the resulting intelligibility model also suitable for specific subtypes of that disorder, namely functional and organic dysphonia?
Seventy-three German-speaking persons with chronic hoarseness read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Perceptual intelligibility scores served as ground truth for training an automatic model that converts speaker-level acoustic measurements into intelligibility scores. Model performance was assessed by cross-validation.
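The evaluation scheme described above can be sketched as follows. The abstract names neither the regression model nor the cross-validation scheme, so the ridge-regularised linear model, the leave-one-speaker-out split, the function names and the synthetic data below are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def loo_predictions(features, scores, ridge=1e-3):
    """Leave-one-speaker-out predictions from a linear model (assumed model type)."""
    n, d = features.shape
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i                      # hold out speaker i
        X = np.hstack([features[train], np.ones((n - 1, 1))])  # add bias column
        y = scores[train]
        # Closed-form ridge solution; the bias is penalised too, for simplicity.
        w = np.linalg.solve(X.T @ X + ridge * np.eye(d + 1), X.T @ y)
        preds[i] = np.append(features[i], 1.0) @ w     # predict the held-out speaker
    return preds

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

# Synthetic stand-in data: 73 speakers, 5 hypothetical speaker-level features.
rng = np.random.default_rng(0)
features = rng.normal(size=(73, 5))
true_w = np.array([1.0, -0.5, 0.8, 0.3, -1.2])        # arbitrary, for illustration
scores = features @ true_w + 0.5 * rng.normal(size=73)  # stand-in perceptual scores

preds = loo_predictions(features, scores)
r = pearson_r(preds, scores)
print(f"cross-validated correlation: r = {r:.2f}")
```

Because each speaker's score is predicted by a model that never saw that speaker, the resulting correlation estimates how well the mapping generalizes to new speakers rather than how well it fits the training data.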
The interrater agreement was r = 0.82 for all patients (n = 73), r = 0.83 for the functional dysphonia subgroup (n = 45) and r = 0.75 for the organic dysphonia subgroup (n = 24). The automatic assessment, which relied on phonologically based acoustic models, yielded correlations between perceptual and automatic intelligibility ratings of r = 0.79 (all patients), r = 0.78 (functional dysphonia) and r = 0.80 (organic dysphonia).
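One common way to obtain an interrater agreement figure of this kind is to correlate each rater with the mean of the remaining raters and average the results. The abstract does not state which definition was used or how many raters took part, so the definition, the four hypothetical raters and the synthetic ratings below are assumptions for illustration only.

```python
import numpy as np

def interrater_agreement(ratings):
    """Mean Pearson r between each rater and the average of the other raters.

    ratings: array of shape (n_raters, n_speakers).
    """
    n_raters = ratings.shape[0]
    rs = []
    for k in range(n_raters):
        others = np.delete(ratings, k, axis=0).mean(axis=0)  # mean of remaining raters
        rs.append(np.corrcoef(ratings[k], others)[0, 1])
    return float(np.mean(rs))

# Synthetic stand-in: 4 hypothetical raters scoring 73 speakers.
rng = np.random.default_rng(1)
latent = rng.normal(size=73)                         # shared "true" intelligibility
ratings = latent + 0.6 * rng.normal(size=(4, 73))    # each rater adds independent noise

agreement_all = interrater_agreement(ratings)
agreement_subgroup = interrater_agreement(ratings[:, :45])  # e.g. a subgroup of 45 speakers
print(f"all speakers: r = {agreement_all:.2f}, subgroup: r = {agreement_subgroup:.2f}")
```

Comparing the human-machine correlation against such a human-human figure is what makes the reported values interpretable: an automatic rating that correlates with the perceptual reference about as strongly as the raters agree with one another is performing close to the level of an individual rater.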
The automatic, objective measurement of intelligibility is a valuable instrument for evidence-based clinical practice.