Seok Minje, Kim Wooseong
Computer Engineering Department, Gachon University, Seongnam 13120, Gyeonggi, Republic of Korea.
Healthcare (Basel). 2023 May 5;11(9):1334. doi: 10.3390/healthcare11091334.
Sarcopenia is a well-known age-related disease that can lead to musculoskeletal disorders and chronic metabolic syndromes, such as sarcopenic obesity. Numerous studies have examined the relationship between sarcopenia and various risk factors, leading to the development of predictive models based on these factors. In this study, we explored the impact of physical activity (PA) in daily life and of obesity on sarcopenia prediction. PA is easier to measure, using personal devices such as smartphones and watches or lifelogs, than other factors that require medical equipment and examination. To demonstrate the feasibility of sarcopenia prediction using PA, we trained various machine learning models, including gradient boosting machine (GBM), XGBoost (XGB), LightGBM (LGB), CatBoost (CAT), logistic regression, support vector classifier, k-nearest neighbors, random forest (RF), multi-layer perceptron, and deep neural network (DNN), using data samples from the Korea National Health and Nutrition Examination Survey. Among the models, the DNN achieved the highest average accuracy, 81%, with PA features across all data combinations, and the accuracy increased up to 90% with the addition of obesity information, such as total fat mass and fat percentage. Because these obesity features are difficult to measure, we also added only waist circumference to the PA features, in which case the DNN recorded the highest accuracy, 84%. This accuracy could be improved by using separate training sets according to gender. When evaluated with multiple metrics, GBM, XGB, LGB, CAT, RF, and DNN demonstrated strong predictive performance using only PA features together with waist circumference, with AUC values of at least about 0.85 and often approaching or exceeding 0.9.
Using SHAP analysis, we also identified the key features of a high-performing model: the quantified PA value and the metabolic equivalent score, in addition to simple obesity measures such as body mass index (BMI) and waist circumference.
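The abstract reports both accuracy and AUC, and the two metrics answer different questions: accuracy depends on a decision threshold, whereas AUC measures how well the scores rank positive cases above negative ones. The following is a minimal sketch of that distinction in plain Python. It is not the authors' evaluation code; the function names, the 0.5 threshold, and the toy labels and scores are all assumptions made up for illustration and are not drawn from the survey data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = sarcopenia, 0 = no sarcopenia.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical model scores (predicted probability of sarcopenia).
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

# Accuracy needs hard predictions; 0.5 is an assumed threshold.
predictions = [1 if s >= 0.5 else 0 for s in scores]

toy_accuracy = accuracy(labels, predictions)
toy_auc = roc_auc(labels, scores)
print(f"accuracy = {toy_accuracy:.3f}, AUC = {toy_auc:.3f}")
```

On this toy data the two metrics disagree noticeably, which is one reason to report several metrics rather than accuracy alone. The pairwise loop is quadratic in the number of samples, so for a real dataset a library implementation would be the more practical choice.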