Georgia Institute of Technology, Atlanta, 30332, USA.
Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA.
Sci Rep. 2023 Nov 23;13(1):20615. doi: 10.1038/s41598-023-47568-w.
Machine learning approaches have been used for the automatic detection of Parkinson's disease, and voice recordings are the most commonly used data type because they are simple and non-invasive to acquire. Although voice recordings captured via telephone or mobile devices allow much easier and wider access for data collection, conflicting performance results currently limit their clinical applicability. This study makes two novel contributions. First, we show the reliability of voice recordings of the sustained vowel /a/ collected over personal telephones in natural settings, by collecting samples from 50 people with specialist-diagnosed Parkinson's disease and 50 healthy controls and applying machine learning classification to voice features related to phonation. Second, we introduce a novel application of a pre-trained convolutional neural network (Inception V3) with transfer learning to analyze the spectrograms of the sustained vowel from these samples. This approach considers speech intensity estimates across time and frequency scales rather than collapsing measurements across time. We show that our deep learning model is superior for the task of classifying people with Parkinson's disease as distinct from healthy controls.
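To make the contrast between a time-frequency representation and a time-collapsed one concrete, the following is a minimal sketch in Python with NumPy. It is not the pipeline used in the study. The synthetic tone standing in for a sustained /a/, the sample rate, the frame and hop lengths, and the helper names `spectrogram_db` and `to_rgb_image` are all assumptions made for illustration. The resizing and fine-tuning steps that a pre-trained network such as Inception V3 would require are omitted.

```python
import numpy as np


def spectrogram_db(signal, frame_len=256, hop=128):
    """Log-power short-time Fourier transform.

    Returns an array with one row per frequency bin and one column per frame.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small constant avoids log(0) in silent bins.
    return 10.0 * np.log10(power.T + 1e-10)


def to_rgb_image(spec_db):
    """Scale to [0, 1] and repeat over 3 channels (image-style CNN input)."""
    lo, hi = spec_db.min(), spec_db.max()
    scaled = (spec_db - lo) / (hi - lo + 1e-12)
    return np.repeat(scaled[:, :, np.newaxis], 3, axis=2)


# Hypothetical stand-in for a recording: a 1 s tone near 120 Hz with a
# slow pitch wobble and one harmonic. Real data would be loaded from audio.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
f0 = 120.0 + 3.0 * np.sin(2.0 * np.pi * 5.0 * t)
phase = 2.0 * np.pi * np.cumsum(f0) / sample_rate
signal = np.sin(phase) + 0.5 * np.sin(2.0 * phase)

spec = spectrogram_db(signal)     # time-frequency representation
image = to_rgb_image(spec)        # 3-channel array for an image model
collapsed = spec.mean(axis=1)     # time-collapsed summary, for contrast

print(spec.shape, image.shape, collapsed.shape)
```

The `spec` array keeps how intensity at each frequency changes from frame to frame, whereas `collapsed` averages that variation away into a single value per frequency bin. An image classifier operates on the former.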