Nematollahi Mohammad Ali, Joloudari Javad Hassannataj, Zare Omid, Maftoun Mohammad, Shadkam Nima, Sharifrazi Danial, Alizadehsani Roohallah, Asadollahi Arefeh
Department of Computer Sciences, Fasa University, Fasa, Iran.
Department of Computer Engineering, Faculty of Engineering, University of Birjand, Birjand, Iran.
J Diabetes Metab Disord. 2025 Jun 17;24(2):151. doi: 10.1007/s40200-025-01661-1. eCollection 2025 Dec.
Diabetes is a chronic illness with severe consequences. Rising morbidity rates point to steep growth in the global diabetes population, which is projected to approach 642 million by 2040, meaning that one in every ten people will be affected. This figure highlights the need for collaborative efforts from industry and academia to accelerate innovation in diabetes risk prediction and, ultimately, save lives. As life-threatening diseases such as diabetes become more frequent, Medical Decision Support Systems (MDSS) continue to prove useful in supporting healthcare professionals, particularly physicians, in clinical decision-making. With advances in technology, machine learning techniques have gained prominence in the early prediction of diabetes. In this paper, we used machine learning techniques and the Analysis of Variance (ANOVA) method to explore associations between regional body fat distribution and diabetes mellitus in a community adult population and to assess how well these measures predict the disease. We applied individual standard classifiers and ensemble learning methods in a retrospective analysis of a subset of body composition data. To address class imbalance in the target variable, we also applied three oversampling methods so that the learning algorithms could produce more accurate predictions. The results show that XGBoost combined with the Adaptive Synthetic Sampling (ADASYN) method outperforms the state of the art, achieving an accuracy of 92.04%, and is more effective for diabetes prediction than the other models evaluated.
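The abstract does not spell out how ANOVA was applied. One common use in this kind of study is to score each candidate feature by its one-way ANOVA F-statistic across the diabetic and non-diabetic groups, and the sketch below illustrates that reading only. The function names, the feature names (`trunk_fat_pct`, `leg_fat_pct`), and the toy numbers are all hypothetical and are not taken from the study's data.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    # Group the feature values by class label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")

    grand_mean = mean(values)
    # Between-group sum of squares: spread of class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    if ss_within == 0:
        # No variation inside any class: separation is perfect (or undefined).
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(columns, labels):
    """Return (feature name, F-statistic) pairs sorted from highest to lowest F."""
    scores = {name: anova_f(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 0 = non-diabetic, 1 = diabetic.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "trunk_fat_pct": [1, 2, 3, 5, 6, 7],  # class means differ
    "leg_fat_pct": [3, 4, 5, 3, 4, 5],    # class means are identical
}
ranking = rank_features(columns, labels)
```

A larger F-statistic means the class means differ more relative to the spread within each class, so in the toy data the first feature ranks above the second. In a full pipeline of the kind the abstract describes, scoring like this would precede oversampling and classifier training; those steps are not sketched here.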