Lin Jon Z, Espinoza Víctor M, Marks Katherine L, Zañartu Matías, Mehta Daryush D
Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114 USA.
Department of Sound, Universidad de Chile, Santiago, Chile.
IEEE J Sel Top Signal Process. 2020 Feb;14(2):449-460. doi: 10.1109/jstsp.2019.2959267. Epub 2019 Dec 12.
Subglottal air pressure plays a major role in voice production and is a primary factor in controlling voice onset, offset, sound pressure level, glottal airflow, vocal fold collision pressures, and variations in fundamental frequency. Previous work has shown promise for the estimation of subglottal pressure from an unobtrusive miniature accelerometer sensor attached to the anterior base of the neck during typical modal voice production across multiple pitch and vowel contexts. This study expands on that work to incorporate additional accelerometer-based measures of vocal function to compensate for non-modal phonation characteristics and achieve an improved estimation of subglottal pressure. Subjects with normal voices repeated /p/-vowel syllable strings from loud-to-soft levels in multiple vowel contexts (/ɑ/, /i/, and /u/), pitch conditions (comfortable, lower than comfortable, higher than comfortable), and voice quality types (modal, breathy, strained, and rough). Subject-specific, stepwise regression models were constructed using root-mean-square (RMS) values of the accelerometer signal alone (baseline condition) and in combination with cepstral peak prominence, fundamental frequency, and glottal airflow measures derived using subglottal impedance-based inverse filtering. Five-fold cross-validation assessed the robustness of model performance using the root-mean-square error metric for each regression model. Each cross-validation fold exhibited up to a 25% decrease in prediction error when the model incorporated multidimensional aspects of the accelerometer signal compared with RMS-only models. Improved estimation of subglottal pressure for non-modal phonation was thus achievable, which supports future studies of subglottal pressure estimation in patients with voice disorders and in ambulatory voice recordings.
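The comparison described in the abstract (an RMS-only baseline against a model with additional accelerometer-derived features, each scored by five-fold cross-validated root-mean-square error) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses ordinary least squares on fixed feature sets rather than stepwise regression, and the data are entirely synthetic. The function names (`fit_ols`, `predict`, `cv_rmse`), the feature ranges, and the coefficients used to generate the data are assumptions made up for illustration only.

```python
import math
import random


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(beta, x):
    """Evaluate the fitted linear model for one feature vector."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def cv_rmse(X, y, k=5, seed=0):
    """Return the held-out RMSE of each of the k cross-validation folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for held in folds:
        held_set = set(held)
        train = [i for i in idx if i not in held_set]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        sq = [(predict(beta, X[i]) - y[i]) ** 2 for i in held]
        errors.append(math.sqrt(sum(sq) / len(sq)))
    return errors


# Synthetic stand-in data (NOT from the study): accelerometer RMS,
# cepstral peak prominence (dB), and fundamental frequency (Hz).
rng = random.Random(42)
features, pressure = [], []
for _ in range(200):
    rms = rng.uniform(0.1, 1.0)
    cpp = rng.uniform(5.0, 20.0)
    f0 = rng.uniform(100.0, 300.0)
    features.append([rms, cpp, f0])
    # Hypothetical generating relation, chosen only so the demo has structure.
    pressure.append(3.0 + 10.0 * rms - 0.4 * cpp + 0.02 * f0 + rng.gauss(0.0, 0.5))

rmse_rms_only = cv_rmse([[row[0]] for row in features], pressure)
rmse_full = cv_rmse(features, pressure)

mean_base = sum(rmse_rms_only) / len(rmse_rms_only)
mean_full = sum(rmse_full) / len(rmse_full)
print(f"RMS-only mean RMSE: {mean_base:.2f}")
print(f"Multi-feature mean RMSE: {mean_full:.2f}")
print(f"Relative decrease: {100.0 * (1.0 - mean_full / mean_base):.0f}%")
```

Because the synthetic pressure values depend on all three features by construction, the multi-feature model necessarily shows lower error here; the size of the improvement says nothing about the study's reported results. A subject-specific analysis would repeat this procedure on each subject's own recordings.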