Rehman Rana Zia Ur, Guan Yu, Shi Jian Qing, Alcock Lisa, Yarnall Alison J, Rochester Lynn, Del Din Silvia
Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, United Kingdom.
School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom.
Front Aging Neurosci. 2022 Mar 22;14:808518. doi: 10.3389/fnagi.2022.808518. eCollection 2022.
Parkinson's disease (PD) is a common neurodegenerative disease that can be misdiagnosed in its early stages. Gait impairment is typical in PD and is linked with increased fall risk and poorer quality of life. Machine learning (ML) models applied to real-world gait may classify PD more sensitively than models applied to laboratory gait data. Real-world gait yields multiple walking bouts (WBs), and selecting the optimal method to aggregate the data (e.g., over different WB durations) is essential because it may influence classification performance. The objective of this study was to investigate the impact of environment (laboratory vs. real world) and data aggregation on ML performance, in order to optimize the sensitivity of PD classification. Gait was assessed in 47 people with PD (age: 68 ± 9 years) and 52 healthy controls (HCs; age: 70 ± 7 years). In the laboratory, participants walked at their normal pace for 2 min; in the real world, participants were assessed over 7 days. In both environments, 14 gait characteristics were evaluated from one tri-axial accelerometer attached to the lower back. The ability of individual gait characteristics to differentiate PD from HC was evaluated using the area under the curve (AUC). ML models (i.e., support vector machine, random forest, and ensemble models) applied to real-world gait showed better classification performance than the same models applied to laboratory data. Real-world gait characteristics aggregated over longer WBs (WB 30-60 s, WB > 60 s, WB > 120 s) resulted in superior discriminative performance (PD vs. HC) compared to laboratory gait characteristics (0.51 ≤ AUC ≤ 0.77). Real-world gait speed showed the highest AUC, at 0.77. Overall, a random forest trained on 14 gait characteristics aggregated over WBs > 60 s performed better (F1 score = 77.20 ± 5.51%) than the laboratory results (F1 score = 68.75 ± 12.80%).
Findings from this study suggest that the choice of environment and of data aggregation method is important for achieving maximum discriminative performance and directly affects ML performance for PD classification. This study highlights the importance of a harmonized approach to data analysis in order to drive future implementation and clinical use.
[09/H0906/82].
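The aggregation and single-characteristic evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the duration boundary handling, and the toy values are assumptions made purely for illustration. Each bout is represented as a (duration in seconds, gait speed) pair, a participant's value is the mean over bouts in a chosen duration range, and the AUC is computed from its rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
from statistics import mean


def aggregate_by_bout_duration(bouts, low, high):
    """Mean of one gait characteristic over bouts with low < duration <= high.

    `bouts` is a list of (duration_s, value) pairs for one participant.
    Returns None when no bout falls inside the range.
    The half-open boundary convention is an assumption, not from the study.
    """
    values = [value for duration, value in bouts if low < duration <= high]
    return mean(values) if values else None


def auc(positive, negative):
    """Rank-based AUC: P(positive > negative) + 0.5 * P(positive == negative)."""
    wins = 0.0
    for p in positive:
        for n in negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive) * len(negative))


# Toy, made-up bouts: (duration in seconds, gait speed in m/s) per participant.
toy_pd = [
    [(45.0, 0.9), (70.0, 1.0), (150.0, 1.1)],
    [(65.0, 0.8), (90.0, 1.0)],
]
toy_hc = [
    [(61.0, 1.3), (200.0, 1.5)],
    [(80.0, 1.2), (20.0, 0.9)],
]

# Aggregate over bouts longer than 60 s (one value per participant).
longer_than_60 = (60.0, float("inf"))
pd_speed = [aggregate_by_bout_duration(b, *longer_than_60) for b in toy_pd]
hc_speed = [aggregate_by_bout_duration(b, *longer_than_60) for b in toy_hc]

# PD is treated as the positive class. Speeds are negated so that a slower
# walker receives a higher score.
speed_auc = auc([-v for v in pd_speed], [-v for v in hc_speed])
```

Changing the duration range passed to `aggregate_by_bout_duration` changes the per-participant values and therefore the AUC, which is the kind of dependence on aggregation that the abstract reports.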