Huo Hua, Jiao Shupei, Li Dongfang, Ma Lan, Xu Ningya
Henan University of Science and Technology, Luoyang, China.
PLoS One. 2025 Apr 2;20(4):e0319826. doi: 10.1371/journal.pone.0319826. eCollection 2025.
The diagnosis of Parkinson's disease relies heavily on the subjective assessment of physicians, which depends on their individual experience and training and can therefore lead to inconsistent diagnostic results. An objective and efficient diagnostic method is needed to improve the accuracy and timeliness of Parkinson's disease diagnosis. In this study, we used the PhysioNet dataset, a time-series dataset comprising data from 93 Parkinson's patients and 73 healthy individuals. The dataset contains vertical ground reaction forces recorded from 16 sensors (8 per foot) during a 2-minute test at a sampling rate of 100 Hz. To address the limited dataset size, high label noise, and high intra-class variability, we preprocessed the data and applied a range of data augmentation techniques: jittering, scaling, rotation, permutation, magnitude warping, time warping, cropping, and linear residuals. These techniques were evaluated with a one-dimensional convolutional neural network (1D-ConvNet) and a one-dimensional Transformer network. Under 10-fold cross-validation, we observed significant improvements in classification performance. The best data augmentation strategy achieved 90.8% accuracy, 92.0% precision, 91.0% recall, and a 91.0% F1 score in assessing disease severity. These results highlight the importance of selecting appropriate data augmentation techniques for time-series data to improve model generalization and diagnostic reliability, and they offer new insights for researchers working with sensor device data. Our results demonstrate that data augmentation can significantly boost the performance of machine-learning models for Parkinson's disease diagnosis.
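To make the augmentation step concrete, the following is a minimal sketch of four of the named techniques (jittering, scaling, permutation, and cropping) applied to a single gait recording. It is not the authors' implementation. The array layout `(time_steps, sensors)`, the function names, and all parameter values (`sigma`, `n_segments`, `length`) are assumptions chosen for illustration; the abstract specifies only the 16 sensors, the 100 Hz sampling rate, and the 2-minute duration.

```python
import numpy as np


def jitter(x, sigma=0.03, rng=None):
    """Add independent Gaussian noise to every sample (sigma is an assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(0.0, sigma, size=x.shape)


def scale(x, sigma=0.1, rng=None):
    """Multiply each sensor channel by one random factor drawn around 1.0."""
    rng = np.random.default_rng() if rng is None else rng
    factors = rng.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factors


def permute(x, n_segments=4, rng=None):
    """Split the recording into time segments and shuffle their order."""
    rng = np.random.default_rng() if rng is None else rng
    segments = np.array_split(x, n_segments, axis=0)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order], axis=0)


def crop(x, length, rng=None):
    """Return a random contiguous window of `length` time steps."""
    rng = np.random.default_rng() if rng is None else rng
    start = rng.integers(0, x.shape[0] - length + 1)
    return x[start:start + length]


# Hypothetical recording: 2 minutes at 100 Hz from 16 sensors (synthetic values).
rng = np.random.default_rng(0)
recording = rng.normal(size=(2 * 60 * 100, 16))

augmented = [
    jitter(recording, rng=rng),
    scale(recording, rng=rng),
    permute(recording, rng=rng),
    crop(recording, length=1000, rng=rng),
]
```

In a cross-validation setting, transforms like these would normally be applied only to the training folds, so that each held-out fold contains unmodified recordings.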