Speech technology-based assessment of phoneme intelligibility in dysarthria.

Author information

DRL-Head and Neck Surgery, Antwerp University Hospital, Antwerp, Belgium.

Publication information

Int J Lang Commun Disord. 2009 Sep-Oct;44(5):716-30. doi: 10.1080/13682820802342062.

Abstract

BACKGROUND

Currently, clinicians mainly rely on perceptual judgements to assess the intelligibility of dysarthric speech. Although often highly reliable, this procedure is subjective and involves many intrinsic variables. Therefore, certain benefits can be expected from a speech technology-based intelligibility assessment. Previous attempts to develop an automated intelligibility assessment mainly relied on automatic speech recognition (ASR) systems that were trained to recognize the speech of persons without known impairments. In this paper, automatic speech alignment (ASA) systems are used instead. In addition, previous attempts only made use of phonemic features (PMF). However, since articulation is an important contributing factor to the intelligibility of dysarthric speech and since phonological features (PLF) are shared by multiple phonemes, phonological features may be more appropriate to characterize and identify dysarthric phonemes.

AIMS

To investigate the reliability of objective phoneme intelligibility scores obtained by three types of intelligibility models: models using only phonemic features (yielded by an automated speech aligner) (PMF models), models using only phonological features (PLF models), and models using a combination of phonemic and phonological features (PMF + PLF models).

METHODS & PROCEDURES

Correlations were calculated between the objective phoneme intelligibility scores of 60 dysarthric speakers and the corresponding perceptual phoneme intelligibility scores obtained by a standardized perceptual phoneme intelligibility assessment.
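The correlation step described above can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation coefficient (the abstract does not say which coefficient was used). The function name `pearson_r` and the score lists are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five speakers (illustrative only, NOT data from the study)
perceptual = [92.0, 75.0, 60.0, 48.0, 30.0]  # scores assigned by human listeners
objective = [88.0, 79.0, 55.0, 50.0, 35.0]   # scores produced by an automated model

r = pearson_r(objective, perceptual)
print(f"r = {r:.3f}")
```

A value of r close to 1 would indicate that the automated scores rank and space the speakers much as the listeners do, which is the sense in which the abstract reports its correlations.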

OUTCOMES & RESULTS

The correlations between the objective and perceptual intelligibility scores range from 0.793 for the PMF models, through 0.828 for the PLF models, to 0.943 for the PMF + PLF models. The features selected to obtain such high correlations can be divided into six main subgroups: (1) vowel-related phonemic and phonological features, (2) lateral-related features, (3) silence-related features, (4) fricative-related features, (5) velar-related features and (6) plosive-related features.

CONCLUSIONS & IMPLICATIONS

The phoneme intelligibility scores of dysarthric speakers obtained by the three investigated intelligibility model types are reliable. The highest correlation between the perceptual and objective intelligibility scores was found for models combining phonemic and phonological features. The intelligibility scoring system is now ready to be implemented in a clinical tool.
