College of Physical Education, Yangzhou University, Yangzhou, 225127, China.
School of Sport and Brain Health, Nanjing Sport Institute, Nanjing, 210014, China.
BMC Public Health. 2024 Nov 1;24(1):3034. doi: 10.1186/s12889-024-20510-z.
Overweight and obesity pose a huge burden on individuals and society. While the relationship between lifestyle factors and overweight and obesity is well-established, the relative contribution of specific lifestyle factors remains unclear. To address this gap in the literature, this study utilizes interpretable machine learning methods to identify the relative importance of specific lifestyle factors as predictors of overweight and obesity in adults.
Data were obtained from 46,057 adults in the China Health and Nutrition Survey (2004-2011) and the National Health and Nutrition Examination Survey (2007-2014). The variables collected were basic demographic information, body weight status, and self-reported lifestyle factors: physical activity, macronutrient intake, and tobacco and alcohol consumption. Three machine learning models (decision tree, random forest, and gradient-boosting decision tree) were used to predict body weight status from lifestyle factors. The SHapley Additive exPlanation (SHAP) method was then used to interpret the predictions of the best-performing model by quantifying how much each lifestyle factor contributed to a prediction of overweight or obesity in adults.
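To make the attribution idea behind SHAP concrete, the following is a minimal sketch that computes exact Shapley values for a single prediction by enumerating every feature coalition. It is not the study's pipeline. The scoring function `toy_model`, its weights, the feature names, and the example individual are all hypothetical. Only the cut-offs are borrowed from the thresholds reported in this abstract. The SHAP library uses faster tree-specific algorithms rather than brute-force enumeration, but the quantity being attributed is the same.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen for illustration only.
FEATURES = ["sedentary_h_per_week", "alcohol_cups_per_week", "protein_g_per_day"]


def toy_model(x):
    """Hypothetical risk score; NOT the fitted model from the study.

    The weights are invented. The cut-offs mirror the thresholds
    mentioned in the abstract (35 h/week, 7 cups/week, 80 g/day).
    """
    score = 0.0
    if x["sedentary_h_per_week"] > 35:
        score += 0.2
    if x["alcohol_cups_per_week"] > 7:
        score += 0.3
    if x["protein_g_per_day"] > 80:
        score += 0.1
    # Invented interaction term, so the attribution is not trivially additive.
    if x["sedentary_h_per_week"] > 35 and x["alcohol_cups_per_week"] > 7:
        score += 0.1
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                # Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = {
                    g: instance[g] if (g in subset or g == f) else baseline[g]
                    for g in FEATURES
                }
                without_f = {
                    g: instance[g] if g in subset else baseline[g]
                    for g in FEATURES
                }
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# Hypothetical individual and reference point.
instance = {"sedentary_h_per_week": 40, "alcohol_cups_per_week": 10, "protein_g_per_day": 90}
baseline = {"sedentary_h_per_week": 20, "alcohol_cups_per_week": 2, "protein_g_per_day": 60}

phi = shapley_values(toy_model, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for the individual and the prediction for the baseline, which is the property that lets per-feature values be read as shares of one prediction. The invented interaction term is split equally between the two features involved.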
The gradient-boosting decision tree model outperformed the decision tree and random forest models. SHAP analysis indicated that sedentary behavior, alcohol consumption, and protein intake were important lifestyle predictors of overweight and obesity in adults. Alcohol consumption was the strongest predictor of overweight, and sedentary time was the strongest predictor of obesity. Specifically, sedentary behavior exceeding 28-35 h/week, alcohol consumption of more than 7 cups/week, and protein intake exceeding 80 g/day increased the likelihood of being predicted as overweight or obese.
Pooled evidence from two nationally representative studies suggests that recognizing demographic differences and emphasizing the relative importance of sedentary behavior, alcohol consumption, and protein intake are beneficial for managing body weight status in adults. The specific risk thresholds for lifestyle factors observed in this study can help inform and guide future research and public health actions.