Costantini Giovanni, Cesarini Valerio, Robotti Carlo, Benazzo Marco, Pietrantonio Filomena, Di Girolamo Stefano, Pisani Antonio, Canzi Pietro, Mauramati Simone, Bertino Giulia, Cassaniti Irene, Baldanti Fausto, Saggio Giovanni
Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy.
Department of Otolaryngology - Head and Neck Surgery, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy.
Knowl Based Syst. 2022 Oct 11;253:109539. doi: 10.1016/j.knosys.2022.109539. Epub 2022 Jul 28.
Alongside the nasal swab testing currently in use, the response to the COVID-19 pandemic would benefit noticeably from low-cost tests that are available anytime, anywhere, at large scale, and with real-time answers. A novel approach to COVID-19 assessment is adopted here, discriminating negative subjects from positive or recovered subjects. The aims are to identify potential discriminating features, highlight mid- and short-term effects of COVID-19 on the voice, and compare two custom algorithms. A pool of 310 subjects took part in the study; recordings were collected in a low-noise, controlled setting using three different vocal tasks. Binary classifications followed, using two different custom algorithms. The first was based on the coupling of boosting and bagging, with an AdaBoost classifier using Random Forest learners. A feature selection process was employed for training, identifying a subset of features acting as clinically relevant biomarkers. The other approach was centered on two custom CNN architectures applied to mel-spectrograms, with a custom knowledge-based data augmentation. Performances, evaluated on an independent test set, were comparable: AdaBoost and CNN differentiated COVID-19 positive from negative subjects with accuracies of 100% and 95% respectively, and recovered from negative individuals with accuracies of 86.1% and 75% respectively. This study highlights the possibility of identifying COVID-19 positive subjects, foreseeing a tool for on-site screening, while also considering recovered subjects and the effects of COVID-19 on the voice. The two proposed novel architectures allow for the identification of biomarkers and demonstrate the ongoing relevance of traditional ML versus deep learning in speech analysis.
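The coupling of boosting and bagging mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no hyperparameters or feature definitions, so the sketch swaps the Random Forest learners for a much simpler bagged set of decision stumps, and every function name, parameter, and data point below is hypothetical. The toy values are invented and do not come from the study.

```python
import math
import random

def fit_stump(X, y):
    """Find the single-feature threshold rule with the fewest errors (labels are -1/+1)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                errors = sum(1 for row, label in zip(X, y)
                             if (sign if row[j] > t else -sign) != label)
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    return best[1:]  # (feature index, threshold, sign)

def stump_predict(stump, row):
    j, t, sign = stump
    return sign if row[j] > t else -sign

def fit_bagged(X, y, weights, n_stumps, rng):
    """Bagging step: each stump is fit on a bootstrap sample drawn with the boosting weights."""
    stumps = []
    for _ in range(n_stumps):
        sample = rng.choices(range(len(X)), weights=weights, k=len(X))
        stumps.append(fit_stump([X[i] for i in sample], [y[i] for i in sample]))
    return stumps

def bagged_predict(stumps, row):
    """Majority vote of the bagged stumps (ties go to +1)."""
    return 1 if sum(stump_predict(s, row) for s in stumps) >= 0 else -1

def fit_adaboost(X, y, n_rounds=5, n_stumps=5, seed=0):
    """Boosting step: discrete AdaBoost whose weak learner is a bagged set of stumps."""
    rng = random.Random(seed)
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_rounds):
        learner = fit_bagged(X, y, weights, n_stumps, rng)
        preds = [bagged_predict(learner, row) for row in X]
        err = sum(w for w, p, label in zip(weights, preds, y) if p != label)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clip to keep alpha finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, learner))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * label * p)
                   for w, p, label in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * bagged_predict(learner, row) for alpha, learner in ensemble)
    return 1 if score >= 0 else -1

# Invented two-feature toy data; -1 and +1 stand in for the two classes.
X_toy = [[0.1, 5.0], [0.2, 3.0], [0.3, 4.0], [0.4, 1.0],
         [0.6, 2.0], [0.7, 5.0], [0.8, 1.0], [0.9, 3.0]]
y_toy = [-1, -1, -1, -1, 1, 1, 1, 1]

model = fit_adaboost(X_toy, y_toy, n_rounds=5, n_stumps=5, seed=0)
predictions = [adaboost_predict(model, row) for row in X_toy]
```

In this sketch the boosting weights reach the bagged learner through weighted bootstrap resampling, which is one common way to combine the two ideas. A library implementation of AdaBoost over Random Forests may instead pass sample weights directly to each tree.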