Qasrawi Radwan, Ajab Abir, Cheikh Ismail Leila, Al Dhaheri Ayesha, Alblooshi Sharifa, Abu Ghoush Razan, Vicuna Polo Stephanny, Amro Malak, Thwib Suliman, Issa Ghada, Al Sabbah Haleama
Department of Computer Science, Al-Quds University, Jerusalem, Palestine.
Department of Computer Engineering, Istinye University, Istanbul, Türkiye.
Front Nutr. 2025 Jul 18;12:1574063. doi: 10.3389/fnut.2025.1574063. eCollection 2025.
Obesity and underweight are increasingly common among young adult women, often resulting from complex interactions between diet, lifestyle, and socioeconomic factors. This study examines those interactions by applying machine learning to a wide range of behavioral, dietary, and demographic data. The main research question is: What are the key factors influencing weight status among female university students, and how accurately can machine learning models identify them? We hypothesize that distinct factors are significantly associated with underweight, overweight, and obesity, and that machine learning can reliably detect these patterns. The aim is to identify the strongest predictors and support more targeted weight management strategies.
This cross-sectional study analyzed data from 7,092 female university students (aged 18-30 years) in Palestine and the UAE. Sociodemographic, dietary, and lifestyle predictors were evaluated using machine learning (Random Forest, SVM, logistic regression, gradient boosting, decision trees, and ensemble methods). The Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance. Model performance was assessed via 10-fold cross-validation, with significance determined by the chi-square test (p < 0.05, 95% CI).
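As an illustration of the oversampling step named above, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic sample is an interpolation between a minority-class point and one of its k nearest minority-class neighbours. It is not the authors' implementation; the function name `smote`, its parameters, and the toy feature vectors are assumptions made for this example only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up two-feature vectors standing in for a minority weight-status class
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, it stays inside the range already covered by that class. In a cross-validated setup, oversampling is normally applied to the training folds only, so that synthetic points never leak into the held-out fold.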
The Random Forest model achieved the highest accuracy (obesity: 96.8%, underweight: 94.6%, overweight: 90.3%) and AUC (0.95-0.97). The main drivers of weight status categories were as follows: underweight was associated with low water/milk intake and preference for fast food; overweight with added oil, large eating quantity, and low physical activity; and obesity with energy drink consumption, salty snacks, and irregular meals. All findings were statistically significant (p < 0.001). Socio-demographic factors (e.g., low income and marital status) and lifestyle habits (e.g., sleep <5 h and fast eating) were also significantly related to weight status.
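The significance statements above rest on chi-square tests of association between a categorical predictor and weight status. The sketch below shows how the Pearson chi-square statistic is computed for a contingency table and compared with the 0.05 critical value for one degree of freedom (3.841). The counts are invented for illustration and are not data from the study; `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up counts: rows = habit present / absent, columns = category yes / no
table = [[30, 70], [50, 50]]
stat, dof = chi_square_statistic(table)

CRITICAL_0_05_DF1 = 3.841  # chi-square critical value for alpha = 0.05, 1 degree of freedom
significant = stat > CRITICAL_0_05_DF1
```

A statistic above the critical value indicates that the association would be unlikely under independence at the chosen threshold; it says nothing about the direction or size of the effect.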
The integration of these findings into weight management frameworks can significantly enhance the detection and understanding of modifiable determinants, thereby informing public health interventions, guiding the development of targeted weight management strategies, and contributing to the global movement toward healthier bodies.