Ghiasi Hafezi Somayeh, Ghasemabadi Atena, Soleimani Negar, Allahyari Maryam, Moradi Mina, Mansoori Amin, Kolahi Ahari Rana, Ghamsary Mark, Ferns Gordon, Esmaily Habibollah, Ghayour-Mobarhan Majid
International UNESCO center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, Iran.
Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.
Popul Health Metr. 2025 Aug 19;23(1):48. doi: 10.1186/s12963-025-00410-z.
Dyslipidemia, a modifiable risk factor for chronic non-communicable diseases, has become a worldwide concern. We aimed to evaluate different anthropometric measures as predictors of dyslipidemia using several machine learning methods.
A total of 9,640 participants from the baseline of the Mashhad Stroke and Heart Atherosclerotic Disorder (MASHAD) study were included in the analysis, of whom 8,252 had dyslipidemia and 1,388 did not. Various anthropometric indices were examined, including waist-to-height ratio (WHtR), body roundness index (BRI), abdominal volume index (AVI), weight-adjusted waist index (WWI), lipid accumulation product (LAP), visceral adiposity index (VAI), conicity index (C-index), body surface area (BSA), body adiposity index (BAI), and waist-to-hip ratio (WHR). The association between these indices and dyslipidemia was assessed using logistic regression (LR), decision tree (DT), random forest (RF), neural network (NN), K-nearest neighbors (KNN), and eXtreme Gradient Boosting (XGBoost) models.
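To make the index definitions concrete, the following is a minimal sketch of how a few of the listed indices can be computed from basic body measurements. The abstract does not state which formulas the study used, so the definitions below are the commonly cited ones and should be read as assumptions; the function name `anthropometric_indices` and its parameters are hypothetical. BSA, BRI, and AVI are omitted because several variant formulas exist, and LAP and VAI are omitted because they also require lipid measurements.

```python
import math


def anthropometric_indices(weight_kg, height_cm, waist_cm, hip_cm):
    """Return a dict of indices using commonly cited definitions (assumed,
    not confirmed by the source abstract)."""
    height_m = height_cm / 100.0
    waist_m = waist_cm / 100.0
    return {
        # Body mass index: weight (kg) divided by height (m) squared
        "BMI": weight_kg / height_m ** 2,
        # Waist-to-height ratio: both lengths in the same unit
        "WHtR": waist_cm / height_cm,
        # Waist-to-hip ratio
        "WHR": waist_cm / hip_cm,
        # Weight-adjusted waist index: waist (cm) / sqrt(weight in kg)
        "WWI": waist_cm / math.sqrt(weight_kg),
        # Body adiposity index: hip (cm) / height (m)^1.5, minus 18
        "BAI": hip_cm / height_m ** 1.5 - 18.0,
        # Conicity index: waist (m) / (0.109 * sqrt(weight kg / height m))
        "C-index": waist_m / (0.109 * math.sqrt(weight_kg / height_m)),
    }


if __name__ == "__main__":
    # Illustrative values only, not data from the MASHAD study
    for name, value in anthropometric_indices(70, 170, 85, 100).items():
        print(f"{name}: {value:.3f}")
```

Because WHtR and WHR are ratios of lengths, a one-unit change in either is far larger than any difference seen between people, which matters when interpreting per-unit odds ratios.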
In the LR model, several factors, including BAI, BSA, age, and WHR, were significant. For example, each one-unit increase in WHR was associated with roughly ninefold higher odds of dyslipidemia (OR = 9.29, 95% CI: 4.09, 21.08). Additionally, the DT model indicated that BMI was the most influential predictor, followed by age and WHR. The LR model outperformed the other models, with the highest accuracy (0.89) and AUC-ROC (0.89), showing a strong ability to classify dyslipidemia cases. Feature importance analysis showed that variables such as BSA contributed differently across models, with XGBoost relying on BSA more than LR did. Its balanced performance made LR the best choice.
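The odds ratio for WHR can be checked against its confidence interval. A Wald interval is symmetric on the log-odds scale, so the point estimate should equal the geometric mean of the two bounds. The sketch below recovers the implied coefficient and standard error from the reported interval (4.09, 21.08), assuming a standard 95% Wald interval with z = 1.96; the abstract does not state how the interval was computed, and the function names are hypothetical. It also rescales the odds ratio to a 0.1-unit increase in WHR, which assumes the effect is linear on the log-odds scale.

```python
import math


def wald_summary(ci_low, ci_high, z=1.96):
    """Recover (beta, standard error, odds ratio) from a Wald CI on the OR.

    Assumes the interval is exp(beta - z*se) to exp(beta + z*se).
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    beta = (log_low + log_high) / 2.0        # midpoint on the log scale
    se = (log_high - log_low) / (2.0 * z)    # half-width divided by z
    return beta, se, math.exp(beta)


def rescale_odds_ratio(beta, step):
    """Odds ratio for an increase of `step` units in the predictor."""
    return math.exp(beta * step)


if __name__ == "__main__":
    beta, se, odds_ratio = wald_summary(4.09, 21.08)
    print(f"implied OR per 1 unit of WHR: {odds_ratio:.2f}")
    print(f"implied beta: {beta:.3f}, implied SE: {se:.3f}")
    print(f"implied OR per 0.1 unit of WHR: {rescale_odds_ratio(beta, 0.1):.2f}")
```

The implied point estimate is about 9.29, which matches the "roughly nine times" wording in the text. The per-0.1 figure is often easier to interpret clinically, since WHR values across adults typically span only a few tenths.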
The machine learning models produced consistent findings, highlighting BMI, WHR, BSA, and BAI as key anthropometric indices for predicting dyslipidemia. These indices consistently emerged as strong predictors, underscoring their importance in assessing the risk of dyslipidemia.