Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Samut Prakan, Thailand.
Department of Communication Sciences and Disorders, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand.
Sci Rep. 2021 Sep 27;11(1):19149. doi: 10.1038/s41598-021-98742-x.
Deep learning has recently achieved a breakthrough in image classification accuracy, mainly owing to convolutional neural networks. In the present study, we used deep learning to investigate whether subclinical alterations in voice features are present in COVID-19 patients shortly after resolution of the disease. This prospective study included 76 post COVID-19 patients and 40 healthy individuals. Patients were classified as post COVID-19 when more than eight weeks had passed since the onset of symptoms. Voice samples of an 'ah' sound, a coughing sound, and a polysyllabic sentence were collected and converted to log-mel spectrograms. Transfer learning with the pre-trained VGG19 convolutional neural network was performed on all voice samples. The model using the polysyllabic sentence yielded the highest classification performance, the model using the coughing sound yielded the lowest, and the monosyllabic 'ah' sound fell between the two in its ability to predict recent COVID-19. The polysyllabic-sentence model achieved 85% accuracy, 89% sensitivity, and 77% specificity. In conclusion, deep learning can detect subtle changes in the voice features of COVID-19 patients after recent resolution of the disease.
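For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally derived from binary predictions. It is not the authors' evaluation code: the function name `classification_metrics` and the label lists are hypothetical, and the example labels are invented for illustration only, not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),   # all correct / all samples
        "sensitivity": tp / (tp + fn),        # true positive rate
        "specificity": tn / (tn + fp),        # true negative rate
    }


# Hypothetical labels (NOT study data): 1 = post COVID-19, 0 = healthy.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 7 + [1] * 3
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because the two groups in a study like this one differ in size (76 versus 40), accuracy alone can be dominated by the larger class, which is one reason sensitivity and specificity are usually reported alongside it.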