University of Wisconsin-Madison School of Medicine and Public Health, Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, Madison, Wisconsin.
J Voice. 2022 Jan;36(1):21-26. doi: 10.1016/j.jvoice.2020.03.011. Epub 2020 May 29.
Acoustic analysis is a commonly used method for quantitatively measuring vocal fold function. The accuracy of acoustic analysis depends upon the operator selecting a stable segment of the voice sample to analyze. This paper proposes a novel method to more accurately and reliably select a stable voice segment.
Four selection methods were implemented to evaluate each raw audio signal and determine its most stable segment: the proposed modal periodogram method, the moving window method, the midvowel method, and the whole vowel method. To evaluate each method, three acoustic parameters of interest, namely perturbation (jitter), correlation dimension (D2), and spectrum convergence ratio (SCR), were calculated for 48 phonation samples.
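As an illustration of the perturbation measure named above, the following is a minimal sketch of local jitter, assuming the common definition (mean absolute difference between consecutive glottal periods divided by the mean period). The abstract does not state which jitter variant was used, so the function name `local_jitter_percent` and the definition itself are assumptions of this example, and the glottal periods are assumed to have been extracted already.

```python
def local_jitter_percent(periods):
    """Local jitter as a percentage.

    Assumed definition: mean absolute difference between consecutive
    glottal periods, divided by the mean period, times 100.
    `periods` is a sequence of period durations in seconds.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    # Absolute cycle-to-cycle differences between neighbouring periods.
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_abs_diff / mean_period
```

A perfectly periodic segment gives a jitter of zero, and larger cycle-to-cycle variation gives larger values, which is why the choice of analysis segment affects the result.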
The proposed modal periodogram method uses a minimum mean-square error approach to calculate a stable modal periodogram and, from it, obtain the most stable segment. The Wilcoxon signed-rank test was used to compare the jitter, D2, and SCR values obtained with the modal periodogram method against those obtained with the current standard segment selection methods.
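The abstract does not give the algorithm in detail, so the following is only a rough sketch of how a minimum mean-square error selection against a reference periodogram might look, not the authors' implementation. It assumes NumPy, Hann-windowed frames, per-frame power normalization, and an element-wise median across frames as a stand-in for the "modal" periodogram. The names `frame_periodograms` and `most_stable_segment` and all parameters are hypothetical.

```python
import numpy as np

def frame_periodograms(signal, frame_len, hop):
    """Return (periodograms, frame start indices), one row per frame."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    starts = np.arange(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Assumption: normalize each frame to unit total power so that
    # loudness changes do not dominate the comparison.
    power /= power.sum(axis=1, keepdims=True) + 1e-12
    return power, starts

def most_stable_segment(signal, frame_len, hop, frames_per_segment):
    """Return (start, end) sample indices of the run of consecutive
    frames whose periodograms are closest, in mean-square error,
    to the reference periodogram."""
    power, starts = frame_periodograms(signal, frame_len, hop)
    if len(starts) < frames_per_segment:
        raise ValueError("signal too short for the requested segment")
    # Assumption: the element-wise median stands in for the modal periodogram.
    reference = np.median(power, axis=0)
    frame_mse = np.mean((power - reference) ** 2, axis=1)
    # Average the per-frame error over every run of consecutive frames.
    kernel = np.ones(frames_per_segment) / frames_per_segment
    run_mse = np.convolve(frame_mse, kernel, mode="valid")
    best = int(np.argmin(run_mse))
    start = int(starts[best])
    end = int(starts[best + frames_per_segment - 1]) + frame_len
    return start, end
```

On a sustained vowel with an unsteady onset, a selection of this kind would be expected to skip the onset and return a later segment, whereas a whole vowel selection would include it.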
The modal periodogram method yielded significantly lower D2 values and significantly higher SCR values for both normal and disordered voice samples (P < 0.01). This indicates that the modal periodogram method is better suited to selecting a stable audio segment than the other selection methods.
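The paired comparison behind a result of this kind can be sketched with SciPy's `scipy.stats.wilcoxon`. The helper name `compare_paired` is hypothetical, the default `alpha=0.01` only mirrors the threshold reported above, and any input values used with it here are illustrative, not the study's data.

```python
from scipy.stats import wilcoxon

def compare_paired(values_a, values_b, alpha=0.01):
    """Two-sided Wilcoxon signed-rank test on paired measurements.

    values_a and values_b hold the same acoustic parameter (for
    example D2) measured on the same samples under two segment
    selection methods. Returns (statistic, p_value, significant).
    """
    result = wilcoxon(values_a, values_b)
    return result.statistic, result.pvalue, bool(result.pvalue < alpha)
```

The test is paired because each phonation sample is analyzed under every selection method, so the comparison is made on within-sample differences rather than on two independent groups.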