Choe Ju-Pil, Lee Seungbak, Kang Minsoo
Health and Sport Analytics Laboratory, Department of Health, Exercise Science, and Recreation Management, The University of Mississippi, University, 38677, USA.
Sci Rep. 2025 Feb 15;15(1):5650. doi: 10.1038/s41598-025-90077-1.
This study aimed to build machine learning (ML) models that predict adherence to physical activity (PA) guidelines and to identify the key determinants of that adherence. A total of 11,638 entries from the National Health and Nutrition Examination Survey were analyzed. Variables were grouped into demographic, anthropometric, and lifestyle categories. Eighteen prediction models were created using six ML algorithms and evaluated by accuracy, F1 score, and area under the curve (AUC). We also used permutation feature importance (PFI) to assess the importance of each variable in each model. The decision tree trained on all variables performed best at predicting adherence to the PA guidelines (accuracy = 0.705, F1 score = 0.819, AUC = 0.542). Based on PFI, sedentary behavior, age, gender, and educational status were the most important variables. These results highlight the potential of data-driven ML methods in PA research. The analysis also identified key variables that can inform targeted interventions aimed at improving individuals' adherence to PA guidelines.
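As an illustration of the permutation feature importance (PFI) idea mentioned in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes a toy dataset and a hand-written rule standing in for a trained model; the feature names (`sedentary_hours`, `noise`), the threshold of 8, and the helper functions are all hypothetical. PFI is taken here as the mean drop in accuracy after shuffling one feature column.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical toy data: column 0 is "sedentary_hours", column 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 16), data_rng.uniform(0, 1)] for _ in range(200)]
# Assumed rule for illustration only: "adherent" when sedentary_hours < 8.
y = [int(row[0] < 8) for row in X]


def toy_model(row):
    """Stand-in for a fitted classifier; it uses only column 0."""
    return int(row[0] < 8)


baseline, importances = permutation_importance(toy_model, X, y)
for name, score in zip(["sedentary_hours", "noise"], importances):
    print(f"{name}: mean accuracy drop = {score:.3f}")
```

In this sketch the feature the model relies on shows a large accuracy drop when shuffled, while the ignored noise feature shows none. With a real fitted model, the same procedure would typically be run on held-out data.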