Laryngeal Pressure Estimation With a Recurrent Neural Network.

Author information

Gomez Pablo, Schutzenberger Anne, Semmler Marion, Dollinger Michael

Affiliations

Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, 91054 Erlangen, Germany.

Publication information

IEEE J Transl Eng Health Med. 2018 Dec 27;7:2000111. doi: 10.1109/JTEHM.2018.2886021. eCollection 2019.

Abstract

Quantifying the physical parameters of voice production is essential for understanding the process of phonation and can aid in voice research and diagnosis. As an alternative to invasive measurements, they can be estimated by formulating an inverse problem using a numerical forward model. However, high-fidelity numerical models are often computationally too expensive for this. This paper presents a novel approach to train a long short-term memory network to estimate the subglottal pressure in the larynx at massively reduced computational cost using solely synthetic training data. We train the network on synthetic data from a numerical two-mass model and validate it on experimental data from 288 high-speed video recordings of porcine vocal folds from a previous study. The training requires significantly fewer model evaluations compared with the previous optimization approach. On the test set, we maintain a comparable performance of 21.2% versus previous 17.7% mean absolute percentage error in estimating the subglottal pressure. The evaluation of one sample requires a vanishingly small amount of computation time. The presented approach is able to maintain estimation accuracy of the subglottal pressure at significantly reduced computational cost. The methodology is likely transferable to estimate other parameters and training with other numerical models. This improvement should allow the adoption of more sophisticated, high-fidelity numerical models of the larynx. The vast speedup is a critical step to enable a future clinical application and knowledge of parameters such as the subglottal pressure will aid in diagnosis and treatment selection.
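The accuracy figures in the abstract are mean absolute percentage errors (MAPE). As a minimal sketch of how that metric is commonly computed, assuming the standard definition (the mean of |estimated - measured| / |measured|, expressed in percent), the example below uses a hypothetical helper name and made-up pressure values that do not come from the study:

```python
def mean_absolute_percentage_error(measured, estimated):
    """Return the MAPE, in percent, between measured and estimated values."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for m, e in zip(measured, estimated):
        if m == 0:
            # The percentage error is undefined for a zero reference value.
            raise ValueError("measured values must be non-zero")
        total += abs((e - m) / m)
    return 100.0 * total / len(measured)


# Hypothetical subglottal pressures in pascals (illustrative only, not study data).
measured_pa = [1000.0, 2000.0, 4000.0]
estimated_pa = [1100.0, 1800.0, 4000.0]

mape = mean_absolute_percentage_error(measured_pa, estimated_pa)
print(f"MAPE: {mape:.2f}%")
```

Because each error is divided by the measured value, the metric weights a given absolute deviation more heavily at low pressures than at high ones.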

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef9/6331197/180547d242d4/gomez1-2886021.jpg
