Deep Learning for Neuromuscular Control of Vocal Source for Voice Production

Authors

Palaparthi Anil, Alluri Rishi K, Titze Ingo R

Affiliations

Utah Center for Vocology, University of Utah, Salt Lake City, UT 84112, USA.

School of Biological Sciences, University of Utah, Salt Lake City, UT 84112, USA.

Publication information

Appl Sci (Basel). 2024 Jan;14(2). doi: 10.3390/app14020769. Epub 2024 Jan 16.

DOI:10.3390/app14020769

PMID:39071945

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11281313/

Abstract

A computational neuromuscular control system was developed that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source. In the current study, a biophysical computational model of the vocal system was used as the physical plant. Within it, a three-mass vocal fold model simulated self-sustained vocal fold oscillation. A constant /ə/ vowel was used for the vocal tract shape, and the trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length and longitudinal fiber stress in the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback controllers (acoustic and somatosensory). Fifty thousand steady speech signals were generated with the model to train the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared with the output of the feedforward controller, except for thyroarytenoid muscle activation.
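The controller arrangement described in the abstract (one feedforward command plus corrections from an acoustic and a somatosensory feedback path) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the plant is a hypothetical diagonal linear map rather than the biophysical voice model, the controllers are fixed gains rather than trained deep networks, and all names and numbers (`A_GAINS`, `S_GAINS`, `U_REF`, the gain values) are invented.

```python
# Toy illustration only: hypothetical linear stand-ins, not the paper's
# voice model or its trained deep-learning controllers.

# Order of the four motor commands (normalized, arbitrary units):
# lung pressure, cricothyroid, thyroarytenoid, lateral cricoarytenoid.
A_GAINS = [0.8, 1.2, 0.9, 1.1]  # hypothetical command -> acoustic output gains
S_GAINS = [0.5, 1.5, 1.0, 0.7]  # hypothetical command -> somatosensory output gains


def plant(u):
    """Hypothetical plant: map four commands to acoustic and somatosensory outputs."""
    acoustic = [a * x for a, x in zip(A_GAINS, u)]
    somato = [s * x for s, x in zip(S_GAINS, u)]
    return acoustic, somato


def feedforward(acoustic_target):
    """Imperfect inverse model: it wrongly assumes every acoustic gain is 1."""
    return list(acoustic_target)


def feedback(target, measured, gain):
    """Proportional correction from one sensory error signal."""
    return [gain * (t - m) for t, m in zip(target, measured)]


def control(acoustic_target, somato_target, steps=30, ka=0.3, ks=0.3):
    """Start from the feedforward command, then add both feedback corrections."""
    u = feedforward(acoustic_target)
    for _ in range(steps):
        acoustic, somato = plant(u)
        da = feedback(acoustic_target, acoustic, ka)  # acoustic feedback path
        ds = feedback(somato_target, somato, ks)      # somatosensory feedback path
        u = [x + a + s for x, a, s in zip(u, da, ds)]
    return u


# Build mutually consistent targets from a reference command, then recover it.
U_REF = [0.5, 0.4, 0.3, 0.6]
ACOUSTIC_TARGET, SOMATO_TARGET = plant(U_REF)
U_FINAL = control(ACOUSTIC_TARGET, SOMATO_TARGET)
```

In this toy setting the feedforward guess is deliberately biased, so the feedback terms do most of the correcting. The abstract reports the opposite balance after training: a well-trained feedforward controller leaves only small corrections for the feedback controllers, except for thyroarytenoid activation.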

