Zhang Zihao, Zhao Dechun, Wang Ziqiong, Wei Li
Automation College, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China.
Biomedical Information College, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):17-25. doi: 10.7507/1001-5515.202304011.
Patients with Parkinson's disease develop vocal cord damage early, and their voiceprint characteristics differ significantly from those of healthy individuals, so voiceprints can be used to identify the disease. However, voiceprint datasets for Parkinson's disease contain too few samples. This paper therefore proposes a double self-attention deep convolutional generative adversarial network for sample enhancement, which generates high-resolution spectrograms that are then used to recognize Parkinson's disease with deep learning. The model improves the texture clarity of the generated samples by increasing network depth and combining gradient penalty with spectral normalization. A classification network based on ConvNeXt, a family of pure convolutional neural networks, is constructed with transfer learning to extract and classify voiceprint features, which improves the accuracy of Parkinson's disease recognition. Validation experiments were carried out on a Parkinson's disease speech dataset. Compared with the samples before enhancement, the samples generated by the proposed model show improved clarity and Fréchet inception distance (FID), and the network achieves an accuracy of 98.8%. These results show that the Parkinson's disease recognition algorithm based on double self-attention deep convolutional generative adversarial network sample enhancement can accurately distinguish healthy individuals from patients with Parkinson's disease, which helps address the shortage of voiceprint samples for early recognition of the disease. In summary, the method effectively improves classification accuracy on small-sample Parkinson's disease speech datasets and provides an effective approach to early speech-based diagnosis of Parkinson's disease.
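The abstract names spectral normalization as one of the techniques used to stabilize the generative adversarial network, but gives no implementation details. As a rough illustration of the general technique only (not the authors' code), the sketch below estimates the largest singular value of a weight matrix by power iteration and divides the weights by it, so that the rescaled matrix has spectral norm close to 1. The function names `spectral_norm` and `spectrally_normalize`, the iteration count, and the plain list-of-rows matrix layout are assumptions made for this example.

```python
import math
import random


def _normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of `weight` by power iteration.

    `weight` is a non-zero matrix given as a list of equal-length rows.
    """
    rng = random.Random(seed)
    rows, cols = len(weight), len(weight[0])
    # Random start vector; it only needs a component along the top direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    u = [0.0] * rows
    for _ in range(n_iters):
        # u <- W v, normalized
        u = _normalize(
            [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        )
        # v <- W^T u, normalized
        v = _normalize(
            [sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        )
    # sigma = u^T W v
    wv = [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return sum(u[i] * wv[i] for i in range(rows))


def spectrally_normalize(weight, n_iters=50):
    """Return a copy of `weight` divided by its estimated spectral norm."""
    sigma = spectral_norm(weight, n_iters)
    return [[w / sigma for w in row] for row in weight]


if __name__ == "__main__":
    w = [[1.0, 2.0], [3.0, 4.0]]
    print(spectral_norm(w))
    print(spectral_norm(spectrally_normalize(w)))
```

In a deep learning framework this rescaling is normally applied to each layer's weights during training, usually with a single power-iteration step per update and a persistent `u` vector; PyTorch, for example, provides it as `torch.nn.utils.spectral_norm`. Whether the paper applies it to the generator, the discriminator, or both is not stated in the abstract.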