Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago 8370451, Chile.
Clinical Hospital, University of Chile, Santiago 8380420, Chile.
Sensors (Basel). 2023 Feb 22;23(5):2441. doi: 10.3390/s23052441.
This paper proposes a deep-learning system that assesses dyspnea on the mMRC scale over the phone. The method models the spontaneous behavior of subjects while they pronounce controlled phonetizations. These vocalizations were designed, or chosen, to cope with the stationary noise suppression of cellular handsets, to provoke different rates of exhaled air, and to stimulate different levels of fluency. Time-independent and time-dependent engineered features were proposed and selected, and a k-fold scheme with double validation was adopted to select the models with the greatest potential for generalization. Score fusion methods were also investigated to optimize the complementarity of the controlled phonetizations and of the engineered and selected features. The results reported here were obtained from 104 participants: 34 healthy individuals and 70 patients with respiratory conditions. The subjects' vocalizations were recorded over a telephone call (i.e., with an IVR server). The system achieved an accuracy of 59% (i.e., estimating the correct mMRC score), a root mean square error of 0.98, a false positive rate of 6%, a false negative rate of 11%, and an area under the ROC curve of 0.97. Finally, a prototype with an ASR-based automatic segmentation scheme was developed and implemented to estimate dyspnea online.
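As an illustration of how the reported figures of merit relate to one another, the following minimal sketch computes accuracy, root mean square error, false positive rate, and false negative rate from true and estimated mMRC scores. It is not the authors' evaluation code. The function name `mmrc_metrics` is hypothetical, the example scores are toy values rather than study data, and the rule that an mMRC score of 1 or more counts as a positive (dyspnea) case is an assumption, since the abstract does not state how the binary decision was defined.

```python
import math


def mmrc_metrics(y_true, y_pred, positive_threshold=1):
    """Compute accuracy, RMSE, FPR and FNR for integer mMRC scores (0-4).

    ASSUMPTION: a score >= positive_threshold is treated as "dyspnea present".
    The abstract does not specify this rule; it is chosen for illustration.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    # Accuracy: fraction of subjects whose exact mMRC score was estimated.
    accuracy = sum(t == p for t, p in pairs) / n

    # RMSE: penalizes estimates that land far from the true score.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / n)

    # Binarize both score lists to obtain detection-style error rates.
    true_pos = [t >= positive_threshold for t in y_true]
    pred_pos = [p >= positive_threshold for p in y_pred]

    false_pos = sum(p and not t for t, p in zip(true_pos, pred_pos))
    false_neg = sum(t and not p for t, p in zip(true_pos, pred_pos))
    n_neg = sum(not t for t in true_pos)
    n_pos = sum(true_pos)

    # Rates are undefined (NaN) when a class is absent from the ground truth.
    fpr = false_pos / n_neg if n_neg else float("nan")
    fnr = false_neg / n_pos if n_pos else float("nan")

    return {"accuracy": accuracy, "rmse": rmse, "fpr": fpr, "fnr": fnr}


# Toy scores for illustration only (not data from the study).
toy_true = [0, 0, 1, 2, 3, 4]
toy_pred = [0, 1, 1, 2, 2, 4]
toy_result = mmrc_metrics(toy_true, toy_pred)
```

Reporting exact-score accuracy alongside RMSE is informative because the mMRC scale is ordinal: an estimate that misses by one grade counts fully against accuracy but contributes only modestly to the RMSE.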