Zhang Zhaoyan
Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehab Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794,
J Acoust Soc Am. 2020 Mar;147(3):EL264. doi: 10.1121/10.0000927.
The goal of this study is to estimate vocal fold geometry, stiffness, position, and subglottal pressure from voice acoustics, toward clinical and other voice technology applications. Unlike previous voice inversion research, which often uses lumped-element models of phonation, this study explores the feasibility of voice inversion using data generated from a three-dimensional voice production model. Neural networks are trained to estimate vocal fold properties and subglottal pressure from voice features extracted from the simulation data. Results show reasonably good estimation accuracy, particularly for vocal fold properties with a consistent global effect on voice production, and reasonable agreement with an excised human larynx experiment.
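The inversion idea described above (simulate voice features from known physiological parameters, then train a network to map features back to parameters) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_forward_model` is a made-up two-parameter stand-in, not the three-dimensional voice production model used in the study, and the feature names, network size, and training settings are hypothetical, since the abstract does not specify them.

```python
import math
import random


def toy_forward_model(stiffness, pressure):
    """Hypothetical stand-in for a voice production simulation.

    Maps two normalized parameters (both in [0, 1]) to two made-up
    "voice features". This is NOT the model used in the study.
    """
    f0_like = math.sqrt(0.2 + stiffness) + 0.1 * pressure
    level_like = pressure * (1.0 + 0.3 * stiffness)
    return [f0_like, level_like]


def make_dataset(n, rng):
    """Sample parameters, simulate features, return (features, params) pairs."""
    data = []
    for _ in range(n):
        params = [rng.random(), rng.random()]
        data.append((toy_forward_model(*params), params))
    return data


class TinyMLP:
    """One-hidden-layer tanh network trained with plain SGD (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        # Gradient of 0.5 * squared error with respect to the outputs.
        err = [yi - ti for yi, ti in zip(y, target)]
        # Backpropagate to the hidden layer before w2 is modified.
        grad_h = [sum(err[k] * self.w2[k][j] for k in range(len(err)))
                  * (1.0 - h[j] ** 2) for j in range(len(h))]
        for k, e in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * e * hj
            self.b2[k] -= lr * e
        for j, g in enumerate(grad_h):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * g * xi
            self.b1[j] -= lr * g


def mean_squared_error(model, data):
    total = 0.0
    for x, target in data:
        _, y = model.forward(x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, target))
    return total / (len(data) * len(data[0][1]))


rng = random.Random(0)             # fixed seed so the run is repeatable
train_set = make_dataset(300, rng)
test_set = make_dataset(100, rng)  # held-out simulated cases

net = TinyMLP(n_in=2, n_hidden=8, n_out=2, rng=rng)
mse_before = mean_squared_error(net, test_set)
for epoch in range(100):
    rng.shuffle(train_set)
    for features, params in train_set:
        net.train_step(features, params, lr=0.05)
mse_after = mean_squared_error(net, test_set)

print("held-out MSE before training:", mse_before)
print("held-out MSE after training:", mse_after)
```

The sketch only shows the general workflow of learning an inverse mapping from simulated data. A realistic setup would use many more parameters and features, and would compare the estimates against experimental measurements, as the study does with excised larynx data.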