Kim Jonathan C, Rao Hrishikesh, Clements Mark A
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332
J Acoust Soc Am. 2014 Oct;136(4):EL315-21. doi: 10.1121/1.4896410.
Head and neck cancer can significantly hamper speech production, which often reduces speech intelligibility. This paper presents a method of extracting spectral features using a multi-resolution sinusoidal transform scheme, which enables better representation of spectral and harmonic characteristics. Regression methods were used to predict interval-scaled intelligibility scores of utterances in the NKI-CCRT speech corpus. Including these features lowered the mean squared estimation error from 0.43 to 0.39 on a scale from 1 to 7 (p < 0.001). For binary intelligibility classification, including them improved accuracy by 5.0 percentage points when tested on a disjoint set.
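The two evaluation measures named in the abstract, mean squared estimation error on the 1-to-7 scale and binary classification accuracy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ratings and predictions are made-up numbers rather than NKI-CCRT data, the helper names are hypothetical, and the threshold of 4.0 used to split utterances into two intelligibility classes is arbitrary and not taken from the paper.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between rated and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_accuracy(y_true: Sequence[float], y_pred: Sequence[float],
                    threshold: float) -> float:
    """Fraction of utterances whose rating and prediction fall on the same
    side of the threshold (the threshold is an illustrative assumption)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((t >= threshold) == (p >= threshold)
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up listener ratings on the 1-to-7 scale (NOT data from the corpus).
rated = [2.0, 3.5, 5.0, 6.5, 4.0, 1.5]
# Hypothetical predictions without and with the additional spectral features.
baseline_pred = [3.0, 3.0, 4.0, 6.0, 3.5, 2.5]
augmented_pred = [2.5, 3.5, 4.5, 6.0, 4.5, 2.0]

mse_baseline = mean_squared_error(rated, baseline_pred)
mse_augmented = mean_squared_error(rated, augmented_pred)
acc_baseline = binary_accuracy(rated, baseline_pred, threshold=4.0)
acc_augmented = binary_accuracy(rated, augmented_pred, threshold=4.0)

print(f"MSE: baseline={mse_baseline:.3f}, augmented={mse_augmented:.3f}")
print(f"Accuracy: baseline={acc_baseline:.3f}, augmented={acc_augmented:.3f}")
```

In a setup like the one the abstract describes, both measures would be computed on utterances held out from training (the "disjoint set"), and the accuracy gain would be reported as the difference between the two accuracies in percentage points.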