Alfalki Ali Mamoon
College of Health Professions, University of New England, Biddeford, ME 04005, USA.
Curr Diabetes Rev. 2025;21(3):35-46. doi: 10.2174/1573399820666230605160212.
Diabetes Mellitus is a chronic (long-lasting) health condition caused by inadequate control of blood glucose levels. This study applies several Machine Learning algorithms to predict Type 2 Diabetes Mellitus among women. A University of California Irvine Diabetes Mellitus dataset posted on Kaggle was used for the analysis.
The dataset included eight risk factors for Type 2 Diabetes Mellitus prediction: Age, Systolic Blood Pressure, Glucose, Body Mass Index, Insulin, Skin Thickness, Diabetic Pedigree Function, and Pregnancy. The R language was used for data visualization, and the algorithms considered were Logistic Regression, Support Vector Machines, Decision Trees, and Extreme Gradient Boost. The performance of these algorithms on various classification metrics is also presented. On the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) score, Extreme Gradient Boost performed best at 85%, followed by Support Vector Machines and Decision Trees.
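To make the AUC-ROC metric concrete, the following is a minimal Python sketch of how such a score can be computed from true labels and predicted scores. It is not the study's code (the abstract names R only for visualization); the function name `roc_auc` and the toy values are assumptions introduced purely for illustration. It uses the pairwise-ranking definition: the AUC is the fraction of (positive, negative) pairs in which the positive case receives the higher score.

```python
def roc_auc(labels, scores):
    """Return the AUC-ROC as the probability that a randomly chosen
    positive case is scored above a randomly chosen negative case.
    Tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values for illustration only; they are NOT taken from the study.
toy_labels = [0, 0, 1, 1]            # 1 = diabetic, 0 = non-diabetic
toy_scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical classifier outputs
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 85% indicates that a diabetic case would be ranked above a non-diabetic case in roughly 85 of 100 such pairs.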
Logistic Regression showed low performance, whereas Decision Trees and Extreme Gradient Boost showed promising performance across all the classification metrics. Support Vector Machines yielded a lower support value and therefore cannot be claimed to be a good classifier. The model showed that Glucose Level was the most significant predictor of Type 2 Diabetes Mellitus (strongly correlated), followed by Body Mass Index (moderately correlated), whereas Age, Skin Thickness, Systolic Blood Pressure, Insulin, Pregnancy, and Pedigree Function were less significant. This type of real-time analysis indicates that the symptoms of Type 2 Diabetes Mellitus in women differ markedly from those in men, which highlights the importance of Glucose Levels and Body Mass Index in women.
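The correlation ranking of predictors can be illustrated with a short Python sketch. The abstract does not state which correlation measure was used, so the Pearson coefficient here is an assumption, as are the function name `pearson_r` and the toy values. For a binary outcome, this quantity is also known as the point-biserial correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only; they are NOT taken from the study.
toy_glucose = [85, 90, 150, 160]   # hypothetical glucose readings
toy_outcome = [0, 0, 1, 1]         # 1 = diabetic, 0 = non-diabetic
toy_r = pearson_r(toy_glucose, toy_outcome)
print(round(toy_r, 3))
```

Computing this coefficient for each of the eight risk factors against the outcome and sorting by absolute value would give a ranking of the kind described above, with values near 1 indicating a strong association and values near 0 a weak one.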
Predicting Type 2 Diabetes Mellitus helps public health professionals support women by recommending proper food intake and lifestyle adjustments with good fitness management, so that glucose levels and body mass index are kept under control. Therefore, healthcare systems should give special attention to diabetic conditions in women to reduce exacerbations of the disease and other associated symptoms. This work attempts to predict the occurrence of Type 2 Diabetes Mellitus among women based on their behavioral and biological conditions.