Nakahara Yoshiki, Mabu Shingo, Hirano Tsunahiko, Murata Yoriyuki, Doi Keiko, Fukatsu-Chikumoto Ayumi, Matsunaga Kazuto
Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi 7558611, Japan.
Department of Respiratory Medicine and Infectious Disease, Yamaguchi University Hospital, Yamaguchi 7558505, Japan.
J Clin Med. 2023 Jun 27;12(13):4297. doi: 10.3390/jcm12134297.
Chronic obstructive pulmonary disease (COPD) reduces a patient's physical activity and restricts everyday activities (physical activity disorder). However, the fundamental cause of physical activity disorder has not been identified. In addition, accurate assessment of the disorder requires costly, specialized equipment; hence, it is not routinely assessed in clinical practice. In this study, we constructed a machine learning model to predict physical activity from test items collected during the routine care of COPD patients. Specifically, we first applied three methods (zero-padding, multiple imputation by chained equations (MICE), and k-nearest neighbor (kNN)) to impute missing values in the dataset. We then constructed several types of neural networks to predict physical activity. Finally, we calculated permutation importance to identify how much each test item contributed to the prediction. The distinctive feature of this research is its multifactorial machine learning analysis spanning blood, lung function, walking, and chest imaging tests. In the experiments, missing-value imputation with MICE yielded the best prediction accuracy (73.00%), compared with zero-padding (68.44%) and kNN (71.52%), and significantly outperformed XGBoost (66.12%; p < 0.05). For patients with severe physical activity reduction (total exercise < 1.5), a high sensitivity (89.36%) was obtained. Permutation importance showed that sex, the number of cigarettes, age, and the whole-body phase angle (nutritional status) were the most important items for this prediction. Furthermore, we found that a smaller number of test items could be used in ordinary clinical practice to screen for physical activity disorder.
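The pipeline described in the abstract (impute missing values, fit a predictor, rank test items by permutation importance) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic, the two features are hypothetical, and a fixed threshold rule stands in for the neural networks used in the study. MICE is omitted because it needs an iterative regression loop; in practice a library implementation such as scikit-learn's `IterativeImputer` is a common choice for MICE-style imputation.

```python
import math
import random


def zero_fill(rows):
    """Zero-padding: replace every missing value (None) with 0.0."""
    return [[0.0 if v is None else v for v in row] for row in rows]


def knn_impute(rows, k=2):
    """kNN imputation: fill a missing value with the mean of that feature
    over the k nearest rows that have it observed. Distance is computed
    only on features observed in both rows."""
    def dist(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is None:
                donors = sorted(
                    (dist(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                # Fall back to 0.0 if no other row has this feature observed.
                new_row[j] = sum(val for _, val in donors) / len(donors) if donors else 0.0
        filled.append(new_row)
    return filled


def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled,
    which breaks its relationship with the labels."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic toy data (NOT from the study): feature 0 drives the label,
# feature 1 is irrelevant and has missing entries.
X_raw = [
    [0.9, 1.0], [0.8, None], [0.7, 3.0], [0.6, 2.0],
    [0.2, 1.0], [0.1, None], [0.3, 2.0], [0.4, 3.0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

X = knn_impute(X_raw, k=2)


def predict(row):
    # Hypothetical stand-in for a trained model: thresholds feature 0 only.
    return int(row[0] > 0.5)


importances = permutation_importance(predict, X, y)
print("imputed rows:", X)
print("permutation importances:", importances)
```

Because the stand-in model ignores feature 1, shuffling that column leaves accuracy unchanged and its importance is zero, while shuffling feature 0 lowers accuracy. The same reading applies to the abstract's ranking of test items: items whose permutation costs the most accuracy are the ones the model relies on most.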