Hireš Máté, Gazda Matej, Drotár Peter, Pah Nemuel Daniel, Motin Mohammod Abdul, Kumar Dinesh Kant
Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia.
Comput Biol Med. 2022 Feb;141:105021. doi: 10.1016/j.compbiomed.2021.105021. Epub 2021 Nov 9.
The computerized detection of Parkinson's disease (PD) will facilitate population screening and frequent monitoring and provide a more objective measure of symptoms, benefiting both patients and healthcare providers. Dysarthria is an early symptom of the disease, and examining it for computerized diagnosis and monitoring has been proposed. Deep learning-based approaches have advantages for such applications because they do not require manual feature extraction. Although deep learning has achieved excellent results in speech recognition, its use in the detection of pathological voices remains limited. In this work, we present an ensemble of convolutional neural networks (CNNs) for the detection of PD from the voice recordings of 50 healthy people and 50 people with PD obtained from PC-GITA, a publicly available database. We propose a multiple-fine-tuning method to train the base CNN. This approach reduces the semantic gap between the source task used for network pretraining and the target task by extending the training process with an additional training stage on another dataset. Training and testing were performed for each vowel separately, and 10-fold cross-validation was used to test the models. Performance was measured using accuracy, sensitivity, specificity and area under the ROC curve (AUC). The results show that this approach was able to distinguish between the voices of people with PD and those of healthy people for all vowels. While there were small differences between the vowels, the best performance was obtained for the vowel /a/, for which we achieved 99% accuracy, 86.2% sensitivity, 93.3% specificity and 89.6% AUC. This shows that the method has potential for use in clinical practice for the screening, diagnosis and monitoring of PD, with the advantage that vowel-based voice recordings can be performed online without requiring additional hardware.
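The evaluation protocol named above (10-fold cross-validation scored by accuracy, sensitivity, specificity and AUC) can be illustrated with a minimal sketch. It is not the authors' code: the function names `stratified_folds`, `binary_metrics` and `roc_auc` are hypothetical, stratifying the folds by class is an assumption the abstract does not state, and the toy labels and scores are made up purely to exercise the functions. The CNN itself is not modelled.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with the same class balance in each.

    Stratification is an assumption of this sketch, not a detail from the abstract.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity, with 1 = PD and 0 = healthy.

    Assumes both classes occur in y_true; otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (PD, healthy) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy usage: 50 PD and 50 healthy labels, mirroring only the cohort sizes.
labels = [1] * 50 + [0] * 50
folds = stratified_folds(labels, k=10)
print([len(f) for f in folds])

# Made-up predictions and scores for six recordings.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
print(binary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

In a per-vowel experiment of the kind described, each fold would serve once as the test set while the remaining nine are used for training, and the four metrics would then be summarized across folds.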