Ibarra Emiro J, Arias-Londoño Julián D, Godino-Llorente Juan I, Mehta Daryush D, Zañartu Matías
Department of Electronic Engineering and Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, 2390123, Chile.
ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, 28040, Spain.
Biomed Signal Process Control. 2025 Aug;106. doi: 10.1016/j.bspc.2025.107681. Epub 2025 Feb 26.
Subglottal air pressure is a critical, physiologically based parameter that reveals fundamental pathophysiological processes in patients with voice disorders. However, assessing it in either laboratory or ambulatory settings is challenging: it requires specialized instruments and invasive procedures, and direct measurement is impractical in ambulatory contexts. This study expands upon previous efforts to estimate subglottal pressure from portable, lightweight neck-surface acceleration signals using a physiologically relevant model of voice production combined with machine learning techniques. The proposed approach employs a neural network architecture that is initially trained with numerical simulations from the voice production model and subsequently refined through a domain adaptation strategy from synthetic data to laboratory data. This method provides a means to create subject- and group-specific refinements of the original neural network. For comprehensive comparisons with previous methods reported in the literature, the proposed approach is applied to both normal and disordered voices, including cases of unilateral vocal fold paralysis and phonotraumatic and non-phonotraumatic vocal hyperfunction. The study draws on two datasets comprising a total of 135 participants. The recordings consist of synchronous measurements of oral airflow, intraoral pressure, and signals from a microphone and a neck-surface accelerometer. Each participant was asked to utter /p/-vowel syllable gestures with variations in loudness, vowel, pitch, and voice quality. Compared to previously reported approaches, the proposed method yields subject-specific models that improve the estimation of subglottal pressure by more than 21%, as measured by root mean square error.
These findings underscore the effectiveness of a non-linear, subject-specific regression approach in enhancing the estimation of subglottal pressure from neck-surface vibration signals.
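The improvement reported above is stated in terms of root mean square error (RMSE). As a minimal sketch of how such a figure can be computed, assuming the improvement is the relative reduction in RMSE against a baseline method (the abstract does not spell out the exact formula), the following Python snippet defines the two quantities. The function names `rmse` and `relative_improvement` are hypothetical, and no values from the study are reproduced here.

```python
import math


def rmse(estimates, references):
    """Root mean square error between estimated and reference pressures."""
    if len(estimates) != len(references) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, proposed_rmse):
    """Percent reduction in RMSE relative to a baseline (assumed definition)."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - proposed_rmse) / baseline_rmse
```

Under this assumed definition, a baseline RMSE of 2.0 and a proposed RMSE of 1.5 (illustrative numbers only, in any consistent pressure unit) correspond to a 25% improvement.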