Department of Otorhinolaryngology, University Hospital (Policlinico) Federico II of Naples, Via S. Pansini 5, Naples, Italy.
Institute of High Performance Computing and Networking (ICAR-CNR), Via Pietro Castellino 111, Naples, Italy.
Biomed Res Int. 2018 Aug 8;2018:8193694. doi: 10.1155/2018/8193694. eCollection 2018.
This study presents a clinical evaluation of Vox4Health, an m-health system that estimates the possible presence of a voice disorder by calculating and analyzing the main measures required for acoustic analysis, namely the Fundamental Frequency, jitter, shimmer, and Harmonics-to-Noise Ratio. Acoustic analysis is an objective, effective, and noninvasive tool used in clinical practice to evaluate voice quality quantitatively.
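To make these measures concrete, the following minimal sketch computes the Fundamental Frequency, local jitter, and local shimmer from per-cycle measurements. It assumes the commonly used "local" definitions (mean absolute difference between consecutive cycles, divided by the mean); the abstract does not state which formulas Vox4Health uses, and the cycle values below are hypothetical. The Harmonics-to-Noise Ratio is omitted because it requires the waveform itself.

```python
from statistics import mean

def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)

# Hypothetical glottal cycle durations (seconds) and peak amplitudes (arbitrary units);
# these are illustrative values, not data from the study.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]

f0_hz = 1.0 / mean(periods)                               # Fundamental Frequency
jitter_percent = 100.0 * local_perturbation(periods)      # cycle-to-cycle period variation
shimmer_percent = 100.0 * local_perturbation(amplitudes)  # cycle-to-cycle amplitude variation

print(f"F0 = {f0_hz:.1f} Hz, jitter = {jitter_percent:.2f}%, shimmer = {shimmer_percent:.2f}%")
```

Jitter and shimmer share one formula here, applied to periods and amplitudes respectively, which is why a single helper serves both.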
A clinical study was carried out in collaboration with medical staff of the University of Naples Federico II. A total of 208 volunteers were recruited (mean age, 44.2 ± 13.9 years): 58 healthy subjects (mean age, 36.7 ± 13.3 years) and 150 pathological ones (mean age, 47 ± 13.1 years). Vox4Health was evaluated in terms of classification performance, i.e., sensitivity, specificity, and accuracy, using a rule-based algorithm that considers the most characteristic acoustic parameters to classify the voice as healthy or pathological. The performance was compared with that achieved using Praat, one of the most commonly used tools in clinical practice.
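The evaluation described above can be sketched as a single-parameter threshold rule followed by the three standard metrics. This is only an illustration: the threshold, the jitter values, and the function names are hypothetical, and the abstract does not give the actual rules or cut-offs used by Vox4Health.

```python
def classify_by_threshold(value, threshold):
    """Return True (pathological) when the acoustic measure exceeds the threshold."""
    return value > threshold

def screening_metrics(predicted, actual):
    """Sensitivity, specificity, and accuracy; True means 'pathological'."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # pathological, flagged
    tn = sum(1 for p, a in pairs if not p and not a)  # healthy, not flagged
    fp = sum(1 for p, a in pairs if p and not a)      # healthy, flagged
    fn = sum(1 for p, a in pairs if not p and a)      # pathological, missed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathological and one healthy subject")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical jitter values (%) and reference diagnoses, not data from the study.
jitter_values = [0.4, 1.8, 2.5, 0.7, 1.2, 0.9]
diagnosed_pathological = [False, True, True, False, False, True]
HYPOTHETICAL_THRESHOLD = 1.0  # illustrative cut-off only

predictions = [classify_by_threshold(v, HYPOTHETICAL_THRESHOLD) for v in jitter_values]
metrics = screening_metrics(predictions, diagnosed_pathological)
print(metrics)
```

Sensitivity is computed over the pathological subjects only and specificity over the healthy ones, so with unbalanced groups such as 150 versus 58 a classifier can reach high accuracy while its specificity stays low.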
With the rule-based algorithm, the best accuracy in detecting voice disorders, 72.6%, was obtained using the jitter or shimmer value. The best sensitivity, about 96%, was consistently obtained using jitter, and the best specificity, 56.9%, was achieved using the Fundamental Frequency. In addition, to improve the classification accuracy of the next version of the Vox4Health app, we conducted a preliminary evaluation of several machine learning techniques for classifying the voice as healthy or pathological. The best accuracy (77.4%) was obtained by the Logistic Model Tree algorithm, while the best sensitivity (99.3%) was achieved by the Support Vector Machine. Finally, Instance-based Learning achieved the best specificity (36.2%).
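As a rough sanity check, the reported percentages can be converted into whole numbers of subjects using the group sizes given above (150 pathological, 58 healthy, 208 in total). The counts produced below are inferred by rounding and are an assumption; the abstract does not report the underlying confusion matrices, and the rule-based sensitivity is only given as "about 96%".

```python
def implied_count(percent, group_size):
    """Nearest whole number of subjects corresponding to a reported percentage."""
    return round(percent / 100.0 * group_size)

PATHOLOGICAL, HEALTHY = 150, 58   # group sizes stated in the abstract
TOTAL = PATHOLOGICAL + HEALTHY    # 208 volunteers

# (reported percentage, size of the group it is computed over)
reported = {
    "rule-based accuracy": (72.6, TOTAL),
    "rule-based sensitivity (approx.)": (96.0, PATHOLOGICAL),
    "rule-based specificity": (56.9, HEALTHY),
    "Logistic Model Tree accuracy": (77.4, TOTAL),
    "Support Vector Machine sensitivity": (99.3, PATHOLOGICAL),
    "Instance-based Learning specificity": (36.2, HEALTHY),
}

implied = {name: implied_count(pct, size) for name, (pct, size) in reported.items()}
for name, count in implied.items():
    size = reported[name][1]
    print(f"{name}: about {count} of {size} subjects")
```

Each metric comes from whichever parameter or classifier scored best on it, so the counts should not be combined into a single confusion matrix.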
Given the accuracy achieved, the medical experts judged Vox4Health, in its current version, to be a "good screening tool" for the detection of voice disorders. However, accuracy improves when machine learning classifiers are used instead of the rule-based algorithm.