Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India.
J Acoust Soc Am. 2019 Dec;146(6):4211. doi: 10.1121/1.5134433.
Hypernasality in repaired cleft palate (CP) speech is a consequence of velopharyngeal insufficiency. Coupling of the nasal tract with the oral tract adds nasal formant and antiformant pairs to the hypernasal speech spectrum. As a result, the spectral and linear prediction (LP) residual characteristics of hypernasal speech deviate from those of normal speech. In this work, three features are used to capture these spectral and residual deviations for hypernasality detection: a vocal tract constriction feature, a peak-to-side-lobe ratio feature, and spectral moment features augmented by low-order cepstral coefficients. The first feature captures the prominence of lower frequencies in speech due to the presence of nasal formants, the second captures the undesirable signal components in the residual signal due to the nasal antiformants, and the third captures information about formants and antiformants in the spectrum along with the spectral envelope. The combination of the three features gives normal versus hypernasal speech detection accuracies of 87.76%, 91.13%, and 93.70% for the vowels /a/, /i/, and /u/, respectively, and hypernasality severity detection accuracies of 80.13% and 81.25% for /i/ and /u/, respectively. The speech data were collected from 30 normal control children and 30 children with repaired CP, between the ages of 7 and 12.
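As a rough illustration of what "spectral moment features" can mean, the sketch below treats the normalized magnitude spectrum of one frame as a distribution over frequency and computes its first four moments (centroid, spread, skewness, kurtosis). This is a generic textbook formulation and an assumption on our part; the abstract does not give the paper's exact definition, frame length, or windowing. The function names `magnitude_spectrum` and `spectral_moments` are hypothetical, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the one-sided DFT of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def spectral_moments(frame, sample_rate):
    """Return (centroid, spread, skewness, kurtosis) of the magnitude spectrum.

    Assumption: the magnitude spectrum, normalized to sum to 1, is treated
    as a probability distribution over frequency in Hz.
    """
    mag = magnitude_spectrum(frame)
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(mag))]

    total = sum(mag)
    if total == 0:
        raise ValueError("frame is silent; spectral moments are undefined")
    weights = [m / total for m in mag]

    # First moment: spectral centroid (Hz).
    centroid = sum(f * w for f, w in zip(freqs, weights))
    # Second central moment: variance; its square root is the spread (Hz).
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    spread = math.sqrt(variance)
    if spread == 0:
        return centroid, 0.0, 0.0, 0.0

    # Standardized third and fourth central moments (dimensionless).
    skewness = sum((f - centroid) ** 3 * w
                   for f, w in zip(freqs, weights)) / spread ** 3
    kurtosis = sum((f - centroid) ** 4 * w
                   for f, w in zip(freqs, weights)) / spread ** 4
    return centroid, spread, skewness, kurtosis


if __name__ == "__main__":
    fs = 8000   # hypothetical sampling rate in Hz
    n = 64      # hypothetical frame length in samples
    frame = [math.sin(2 * math.pi * 2000 * t / fs)
             + math.sin(2 * math.pi * 500 * t / fs) for t in range(n)]
    print(spectral_moments(frame, fs))
```

In this toy setting, adding low-frequency energy to a frame pulls the centroid downward, which is the kind of spectral shift such moments are sensitive to. How the paper combines the moments with low-order cepstral coefficients is not specified in the abstract.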