van Santen Jan P H, Prud'hommeaux Emily Tucker, Black Lois M
Center for Spoken Language Understanding, Division of Biomedical Computer Science, Oregon Health & Science University.
Speech Commun. 2009 Nov 1;51(11):1082-1097. doi: 10.1016/j.specom.2009.04.007.
Assessment of prosody is important for diagnosis and remediation of speech and language disorders, for diagnosis of neurological conditions, and for foreign language instruction. Current assessment is largely auditory-perceptual, which has obvious drawbacks; however, automation of assessment faces numerous obstacles. We propose methods for automatically assessing production of lexical stress, focus, phrasing, pragmatic style, and vocal affect. Speech was analyzed from children in six tasks designed to elicit specific prosodic contrasts. The methods involve dynamic and global features, using spectral, fundamental frequency, and temporal information. The automatically computed scores were validated against mean scores from judges who, in all but one task, listened to "prosodic minimal pairs" of recordings, each pair containing two utterances from the same child with approximately the same phonemic material but differing on a specific prosodic dimension, such as stress. The judges identified the prosodic categories of the two utterances and rated the strength of their contrast. For almost all tasks, we found that the automated scores correlated with the mean scores approximately as well as the judges' individual scores did. Real-time scores assigned during examination (as is fairly typical in speech assessment) correlated substantially less strongly with the mean scores than the automated scores did.
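The validation logic described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how the per-judge correlations were computed, so the leave-one-out mean used here is an assumption, as are the function names and the toy scores, which are invented purely for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_of_judges(judge_scores, exclude=None):
    """Per-item mean over judges, optionally leaving one judge out."""
    kept = [s for i, s in enumerate(judge_scores) if i != exclude]
    return [mean(column) for column in zip(*kept)]


def compare_to_judges(judge_scores, automated):
    """Correlate automated scores, and each judge, with the judges' mean.

    Assumption: each judge is compared with the mean of the OTHER judges,
    so that a judge's own ratings do not inflate that judge's correlation.
    """
    auto_r = pearson(automated, mean_of_judges(judge_scores))
    judge_r = [
        pearson(scores, mean_of_judges(judge_scores, exclude=i))
        for i, scores in enumerate(judge_scores)
    ]
    return auto_r, judge_r


# Hypothetical toy data: three judges rating five minimal pairs,
# plus one automated score per pair. Not taken from the study.
judges = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 4],
    [1, 3, 3, 4, 5],
]
automated = [0.1, 0.3, 0.5, 0.8, 0.9]

auto_r, judge_r = compare_to_judges(judges, automated)
print("automated vs. mean:", round(auto_r, 3))
print("each judge vs. mean of others:", [round(r, 3) for r in judge_r])
```

If the automated correlation falls within the range of the individual judges' correlations, the automated measure agrees with the consensus about as well as a human judge does, which is the kind of comparison the abstract reports.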