Rahmatallah Yasir, Kemp Aaron S, Iyer Anu, Pillai Lakshmi, Larson-Prior Linda J, Virmani Tuhin, Prior Fred
Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA.
Georgia Institute of Technology, Atlanta, 30332, USA.
Sci Rep. 2025 Mar 1;15(1):7337. doi: 10.1038/s41598-025-92105-6.
Machine learning approaches, including deep learning models, have shown promising performance in the automatic detection of Parkinson's disease. These approaches rely on different types of data, of which voice recordings are the most widely used because data acquisition is convenient and non-invasive. Our group previously developed an approach that uses a convolutional neural network with transfer learning to analyze spectrogram images of the sustained vowel /a/ and identify people with Parkinson's disease. We tested that approach on a dataset of voice recordings collected over analog telephone lines, which support limited bandwidth. The convolutional neural network with transfer learning outperformed conventional machine learning methods that collapse measurements across time to generate feature vectors. This study builds on those results and makes two novel contributions. First, we tested the performance of our approach on a larger voice dataset recorded using smartphones with wide bandwidth. Performance was comparable between the two datasets generated on different recording platforms, despite differences in the most important features that result from the limited bandwidth of analog telephone lines. Second, we compared the classification performance achieved using linear-scale and mel-scale spectrogram images and found a small but statistically significant gain with mel-scale spectrograms.
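To illustrate the difference between the linear-scale and mel-scale spectrograms compared in the abstract, here is a minimal NumPy sketch. It is not the authors' pipeline: the sampling rate, FFT size, hop length, number of mel bands, the HTK-style mel formula, and the synthetic harmonic signal standing in for a sustained /a/ are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def linear_spectrogram(signal, n_fft=512, hop=128):
    # Power spectrogram on a linear frequency axis: (n_fft // 2 + 1, n_frames)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def mel_filterbank(sr, n_fft=512, n_mels=64):
    # Triangular filters whose centers are evenly spaced on the mel axis
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

# Hypothetical stand-in for a sustained vowel: harmonics of a 120 Hz pitch
sr = 16000
t = np.arange(sr) / sr
vowel = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 30))

lin = linear_spectrogram(vowel)      # linear-scale power spectrogram
mel = mel_filterbank(sr) @ lin       # mel-scale power spectrogram

# Log-compressed versions, as would typically be rendered to images
lin_db = 10.0 * np.log10(lin + 1e-10)
mel_db = 10.0 * np.log10(mel + 1e-10)
```

The mel filterbank devotes more bands to low frequencies than to high ones, so the mel-scale image has finer resolution in the low-frequency region and a coarser view of the upper band. For a narrowband telephone recording, most of the upper linear-scale rows would carry little energy, which is one plausible reason the most important features differ between recording platforms.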