Department of Communication Sciences and Disorders, Montclair State University, Montclair, New Jersey.
Department of Otolaryngology-Head and Neck Surgery, New York State University, New York, New York.
J Voice. 2021 Sep;35(5):703-716. doi: 10.1016/j.jvoice.2020.01.026. Epub 2020 Mar 12.
Smoothed cepstral peak prominence (CPPs) has been shown to be an effective indicator of breathiness (Hillenbrand and Houde, 1996). High-speed videoendoscopy (HSV) is frequently used as a complement to stroboscopy, especially when asymmetric or aperiodic vocal fold vibration is present in dysphonic voices. In an HSV image data set obtained from subjects with normal (nondisordered) voices, we observed some degree of asymmetry in many of the vocal fold displacement curves extracted from the HSV exam videos. We therefore used this data set for a pilot study investigating the relationship of CPPs to cyclical vocal fold vibration parameters, including left-right vocal fold (LRVF) phase asymmetry, in subjects with normal (nondisordered) voices.
Twenty subjects with normal (nondisordered) voices produced sustained vowel phonations while undergoing a transoral HSV examination of the vocal folds with synchronized recording of the voice signal. The glottal area waveform (GAW) and the cyclical parameters open quotient (OQ), closed quotient (CQ), speed quotient (SQ), and LRVF skew were extracted from the HSV exam videos, and CPPs measures were obtained from acoustic analysis of the audio recordings. Correlations among the cyclical parameters and CPPs values were investigated using machine learning with the Regression Learner application in the MATLAB Statistics and Machine Learning Toolbox (version 9.5.0.944444, R2018b, August 28, 2018, (c) 1984-2018, The MathWorks, Inc., Natick, MA).
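As a rough illustration of what the cyclical parameters measure, the sketch below computes OQ, CQ, and SQ from a single cycle of glottal area samples. It is not the extraction procedure used in the study: the function name `cycle_quotients`, the area threshold, and the assumption of one contiguous open phase per cycle are all hypothetical choices, and the definitions used (OQ as open time over cycle time, CQ as 1 - OQ, SQ as opening time over closing time) are the commonly cited ones rather than ones confirmed by the abstract.

```python
def cycle_quotients(gaw, threshold=0.0):
    """Return (OQ, CQ, SQ) for one glottal cycle.

    gaw: glottal area samples covering exactly one cycle (hypothetical input).
    threshold: area above which the glottis is treated as open (assumption).
    Assumes the open phase is one contiguous run of samples.
    """
    n = len(gaw)
    open_idx = [i for i, area in enumerate(gaw) if area > threshold]
    if not open_idx:
        raise ValueError("glottis never opens in this cycle")

    # Index of maximum glottal area (first one if there are ties).
    peak = max(open_idx, key=lambda i: gaw[i])

    opening = peak - open_idx[0] + 1  # open samples up to and including the peak
    closing = open_idx[-1] - peak     # open samples after the peak

    oq = len(open_idx) / n            # fraction of the cycle spent open
    cq = 1 - oq                       # fraction of the cycle spent closed
    sq = opening / closing if closing else float("inf")
    return oq, cq, sq


# Toy cycle: slow opening, fast closing.
oq, cq, sq = cycle_quotients([0, 0, 0, 0, 1, 2, 3, 4, 2, 0])
print(oq, cq, sq)
```

Under these definitions CQ is fully determined by OQ, which is one way predictors of this kind can end up collinear.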
Because the sample size was small and the predictor variables were possibly multicollinear, the only meaningful result obtained with the data set of 20 normal subjects and four predictor variables, with the app's model validation feature turned on to protect against overfitting, was the constant model (ie, the best prediction of CPPs was simply the mean of the 20 observations). To investigate the usefulness of the Regression Learner app more fully, however, the validation feature was turned off and 48 more model types were investigated. Although these results do not necessarily indicate the best regression model for the current data set, they nevertheless demonstrate the utility of the automated approach for finding a regression model for a larger data set to be collected in the future.
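The role of validation in this result can be sketched with a small simulation. The example below is written in Python with NumPy rather than the MATLAB Regression Learner app, uses leave-one-out cross-validation as a stand-in for the app's validation feature, and runs on synthetic numbers, not the study data. All variable names and value ranges are assumptions made for illustration. The synthetic CPPs values are drawn independently of the predictors, and CQ is set to 1 - OQ to mimic collinear predictors.

```python
import numpy as np


def fit_predict_constant(X_train, y_train, X_test):
    """Constant model: always predict the training mean."""
    return np.full(len(X_test), y_train.mean())


def fit_predict_linear(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (minimum-norm solution
    when the design matrix is rank deficient)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def loocv_rmse(fit_predict, X, y):
    """Leave-one-out cross-validated RMSE."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        preds[i] = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
    return rmse(y, preds)


# Synthetic stand-in data: 20 observations, 4 predictors (arbitrary ranges).
rng = np.random.default_rng(0)
n = 20
oq = rng.uniform(0.4, 0.7, n)
cq = 1.0 - oq                      # collinear with OQ by construction
sq = rng.uniform(0.8, 1.5, n)
skew = rng.normal(0.0, 0.05, n)
X = np.column_stack([oq, cq, sq, skew])
cpps = rng.normal(15.0, 1.5, n)    # arbitrary values, unrelated to X

train_const = rmse(cpps, fit_predict_constant(X, cpps, X))
train_lin = rmse(cpps, fit_predict_linear(X, cpps, X))
cv_const = loocv_rmse(fit_predict_constant, X, cpps)
cv_lin = loocv_rmse(fit_predict_linear, X, cpps)

print(f"training RMSE  constant={train_const:.3f}  linear={train_lin:.3f}")
print(f"LOOCV RMSE     constant={cv_const:.3f}  linear={cv_lin:.3f}")
```

Without validation, the linear model can never have a higher training error than the constant model, because it contains the constant model as a special case. Only the held-out error shows whether the extra predictors actually help, which is why a validated comparison on a small sample may favor the constant model.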
Further work is warranted to collect a data set from a larger sample of patients with disordered voices characterized by breathy and/or rough voice quality. It is speculated that a correlation between CPPs and cyclical parameters of vocal fold vibration may be more evident with disordered voices, because greater asymmetry in LRVF displacement is expected, with a corresponding effect on the acoustic voice signal.