Liu Boquan, Polce Evan, Raj Hayley, Jiang Jack
Department of Surgery-Division of Otolaryngology, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA.
Ann Otol Rhinol Laryngol. 2019 Oct;128(10):921-931. doi: 10.1177/0003489419848451. Epub 2019 May 14.
Signal typing has been used to categorize healthy and disordered voices; however, human voices are likely composed of differing proportions of periodic type 1 elements, type 2 elements that are periodic with modulations, aperiodic type 3 elements, and stochastic type 4 elements. A novel diffusive chaos method is presented to detect the distribution of voice types within a signal, with the goal of providing an objective and clinically useful tool for evaluating the voice. It was predicted that continuous calculation of the diffusive chaos parameter throughout the voice sample would allow for construction of comprehensive voice type component profiles (VTCP).
One hundred thirty-five voice samples of sustained /a/ vowels were randomly selected from the Disordered Voice Database Model 4337. All samples were classified according to the voice type paradigm using spectrogram analysis, yielding 34 type 1, 35 type 2, 42 type 3, and 24 type 4 voice samples. All samples were then analyzed using the diffusive chaos method, and VTCPs were generated to show the distribution of the 4 voice type components (VTC).
The proportions of VTC varied significantly between the majority of the traditional voice types (P < .001). Three of the 4 VTCs of type 3 voices were significantly different from the VTCs of type 4 voices (P < .001). These results were compared to calculations of spectrum convergence ratio, which did not vary significantly between voice types 1 and 2 or 2 and 3.
The diffusive chaos method demonstrates proficiency in generating comprehensive VTCPs for disordered voices with varying severity. In contrast to acoustic parameters that provide a single measure of disorder, VTCPs can be used to detect subtler changes by observing variations in each VTC over time. This method also provides the advantage of quantifying stochastic noise components that are due to breathiness in the voice.
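As a rough illustration of what a VTCP summarizes, the sketch below tabulates the proportion of analysis windows assigned to each of the 4 voice types. It assumes that each window of the sample has already been labeled as type 1 to 4; the abstract does not describe how the diffusive chaos parameter is mapped to a type, so the labels, the function name `voice_type_component_profile`, and the example data are hypothetical and do not reproduce the authors' method.

```python
from collections import Counter

# The four voice types named in the abstract.
VOICE_TYPES = (1, 2, 3, 4)


def voice_type_component_profile(window_types):
    """Return the fraction of windows assigned to each voice type.

    `window_types` is a hypothetical sequence of per-window labels (1-4).
    How those labels are derived from the diffusive chaos parameter is
    not specified in the abstract and is outside this sketch.
    """
    if not window_types:
        raise ValueError("need at least one analysis window")
    counts = Counter(window_types)
    unknown = set(counts) - set(VOICE_TYPES)
    if unknown:
        raise ValueError(f"unexpected voice type labels: {sorted(unknown)}")
    total = len(window_types)
    return {t: counts.get(t, 0) / total for t in VOICE_TYPES}


# Made-up labels for ten consecutive windows of one sample.
labels = [1, 1, 2, 3, 3, 3, 4, 3, 2, 1]
profile = voice_type_component_profile(labels)
for voice_type, share in profile.items():
    print(f"type {voice_type}: {share:.0%}")
```

Reporting four proportions per sample, rather than a single label, is what would let changes in one component be tracked over time.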