Department of Otolaryngology - Head & Neck Surgery, Graduate School of Medicine, Kyoto University, Kyoto, Japan; Department of Otolaryngology - Head & Neck Surgery, Kurashiki Central Hospital, Okayama, Japan.
Department of Otolaryngology - Head & Neck Surgery, Kurashiki Central Hospital, Okayama, Japan.
J Voice. 2022 Nov;36(6):770-776. doi: 10.1016/j.jvoice.2020.08.026. Epub 2020 Sep 18.
Cepstral analysis does not require the detection of pitch within waveforms, which makes it suitable for acoustic evaluation of connected speech and of severely disordered voices. Although the utility of cepstral measurements, including cepstral peak prominence (CPP) and cepstral spectral index of dysphonia (CSID), has been reported for several languages, it has yet to be demonstrated for the Japanese language. The current study aimed to investigate the utility of cepstral acoustic analysis in Japanese as an indicator of dysphonia and of the degree of dysphonia severity.
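As an illustration of why no pitch tracking is needed, the following is a minimal sketch of a CPP-style measure for a single frame, assuming NumPy is available. It is not the algorithm used by any particular analysis program: the function name, the 60-300 Hz search range, the Hann window, and the 1 ms start of the regression line are all assumptions chosen for illustration. The peak is located directly in the cepstrum, and its height is measured against a straight line fitted to the cepstrum.

```python
import numpy as np

def cepstral_peak_prominence(frame, sample_rate, f0_min=60.0, f0_max=300.0):
    """Return a simplified CPP value (dB) for one frame of audio samples."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))

    # Log-magnitude spectrum in dB (small offset avoids log of zero).
    log_spectrum_db = 20.0 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)

    # Cepstrum: inverse transform of the log spectrum, again expressed in dB.
    cepstrum = np.abs(np.fft.irfft(log_spectrum_db))
    cepstrum_db = 20.0 * np.log10(cepstrum + 1e-12)
    quefrency = np.arange(len(cepstrum_db)) / sample_rate  # seconds

    # Search for the peak only where a voice fundamental is plausible
    # (assumed range; quefrency index = sample_rate / frequency).
    first = int(sample_rate / f0_max)
    last = int(sample_rate / f0_min)
    if last >= len(cepstrum_db) // 2:
        raise ValueError("frame is too short for the requested f0_min")
    peak = first + int(np.argmax(cepstrum_db[first:last + 1]))

    # Straight-line fit to the cepstrum from 1 ms (assumed) up to the end
    # of the search range; CPP is the peak height above this line.
    fit_start = int(0.001 * sample_rate)
    slope, intercept = np.polyfit(
        quefrency[fit_start:last + 1], cepstrum_db[fit_start:last + 1], 1
    )
    baseline_db = slope * quefrency[peak] + intercept
    return float(cepstrum_db[peak] - baseline_db)

# Synthetic check (not study data): a harmonic 200 Hz tone versus white noise.
sample_rate = 16000
t = np.arange(2048) / sample_rate
voiced = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 40))
noise = np.random.default_rng(0).standard_normal(2048)

voiced_cpp = cepstral_peak_prominence(voiced, sample_rate)
noise_cpp = cepstral_peak_prominence(noise, sample_rate)
```

In this sketch a strongly periodic signal should give a larger value than noise, which mirrors the general expectation that lower CPP accompanies more severe dysphonia. Clinical tools typically add frame averaging and smoothing that are omitted here.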
Ninety-five patients with dysphonia and thirty volunteers without voice complaints produced the sustained vowel /a/ and read four Japanese sentences designed to elicit different laryngeal behaviors. The recorded voice samples were evaluated perceptually by three raters using the grade component of the GRBAS scale and overall severity (OS) on a visual analog scale. Participants were then divided into four groups based on grade and OS: non-, mildly, moderately, and severely dysphonic groups. For the acoustic analysis, CPP and CSID were computed using the Analysis of Dysphonia in Speech and Voice, while jitter percentage (Jitt), shimmer percentage (Shim), and noise-to-harmonic ratio were computed using the Multi-Dimensional Voice Program.
Statistical analysis revealed that both CPP and CSID differed significantly between all groups, with one exception: when participants were grouped by grade, the difference between the non-dysphonic and mildly dysphonic groups was not significant. Pearson correlation analysis between the acoustic measurements and the perceptual ratings revealed that the absolute correlation coefficients for CPP, CSID, and Jitt were greater than 0.7. Specifically, those for CPP and CSID were greater than 0.8 for OS. Receiver operating characteristic (ROC) curve analysis showed that the areas under the curve (AUCs) for CPP, CSID, Jitt, and Shim were greater than 0.8 for both grade and OS. The cut-off values for CPP and CSID, as determined by the Youden Index, were 6.74-7.18 and 12.16-20.39, respectively.
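The ROC-based steps can be sketched in a few lines of plain Python. The Youden Index is J = sensitivity + specificity - 1, and the cut-off is the threshold that maximizes J. The function names, the `higher_is_positive` flag, and the scores below are hypothetical and are not the study's data. The flag is needed because lower CPP but higher CSID indicates dysphonia.

```python
def youden_cutoff(scores, labels, higher_is_positive=True):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):  # every observed value is a candidate
        if higher_is_positive:
            predicted = [s >= cutoff for s in scores]
        else:
            predicted = [s <= cutoff for s in scores]
        true_pos = sum(1 for p, y in zip(predicted, labels) if p and y)
        true_neg = sum(1 for p, y in zip(predicted, labels) if not p and not y)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def auc_by_pairs(scores, labels, higher_is_positive=True):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    sign = 1.0 if higher_is_positive else -1.0
    pos = [sign * s for s, y in zip(scores, labels) if y]
    neg = [sign * s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical CPP-like values: 1 = dysphonic, 0 = non-dysphonic.
scores = [4.1, 5.3, 6.0, 6.9, 7.5, 6.5, 7.8, 8.4, 9.0, 9.6]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

cutoff, j = youden_cutoff(scores, labels, higher_is_positive=False)
auc = auc_by_pairs(scores, labels, higher_is_positive=False)
```

For a CSID-like measure, where higher values indicate dysphonia, the same functions would be called with `higher_is_positive=True`.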
The current study demonstrated the validity of CPP and CSID as indicators of dysphonia and indices of dysphonia severity in the Japanese language.