Ganie Shahid Mohammad, Pramanik Pijush Kanti Dutta, Bashir Malik Majid, Mallik Saurav, Qin Hong
AI Research Centre, School of Business, Woxsen University, Hyderabad, India.
School of Computer Applications and Technology, Galgotias University, Greater Noida, India.
Front Genet. 2023 Oct 26;14:1252159. doi: 10.3389/fgene.2023.1252159. eCollection 2023.
Diabetes is one of the leading healthcare concerns, affecting millions of people worldwide. Acting at the earliest stages of the disease depends on predicting and identifying diabetes early. In recent years, machine learning has been explored in healthcare to support providers in the diagnosis and prognosis of diseases. To predict diabetes, this study ran experiments with five boosting algorithms on the Pima diabetes dataset. The dataset was obtained from the University of California, Irvine (UCI) machine learning repository and contains several important clinical features. Exploratory data analysis was used to characterise the dataset. Upsampling, normalisation, feature selection, and hyperparameter tuning were then applied before predictive modelling. The results were analysed using several statistical and machine learning metrics together with k-fold cross-validation. Gradient boosting achieved the highest accuracy of all the classifiers, 92.85%. Precision, recall, F1-score, and receiver operating characteristic (ROC) curves were used to further validate the model. The proposed model outperformed existing studies in terms of prediction accuracy, which suggests it may be applicable to other diseases with similar predictive indicators.
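The abstract names upsampling, normalisation, and the precision/recall/F1 metrics without giving details. The block below is a minimal standard-library sketch of what those three steps typically look like, not the authors' implementation: the function names (`upsample_minority`, `min_max_normalise`, `binary_metrics`), the choice of random duplication for upsampling, the choice of min-max scaling for normalisation, and all data values are assumptions made for illustration and are not taken from the Pima dataset.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equally sized.

    Assumption: simple random oversampling; the paper may use another scheme.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def min_max_normalise(rows):
    """Rescale each column to [0, 1]; a constant column maps to 0.0.

    Assumption: min-max scaling; the paper only says "normalisation".
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up two-feature rows (not real patient data): four negatives, two positives.
rows = [[90.0, 22.0], [110.0, 30.0], [150.0, 35.0],
        [95.0, 25.0], [170.0, 40.0], [100.0, 28.0]]
labels = [0, 0, 1, 0, 1, 0]

rows_up, labels_up = upsample_minority(rows, labels)
rows_norm = min_max_normalise(rows_up)
print(Counter(labels_up))

# Hypothetical predictions, only to exercise the metric function.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

When these steps are combined with k-fold cross-validation, upsampling and the scaling parameters are usually computed on the training folds only, so that duplicated or rescaled rows do not leak into the held-out fold. Whether the study did this is not stated in the abstract.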