

Toward reliable diabetes prediction: Innovations in data engineering and machine learning applications.

Authors

Talukder Md Alamin, Islam Md Manowarul, Uddin Md Ashraf, Kazi Mohsin, Khalid Majdi, Akhter Arnisha, Ali Moni Mohammad

Affiliations

Department of Computer Science and Engineering, International University of Business Agriculture and Technology, Dhaka, Bangladesh.

Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh.

Publication information

Digit Health. 2024 Aug 21;10:20552076241271867. doi: 10.1177/20552076241271867. eCollection 2024 Jan-Dec.

Abstract

OBJECTIVE

Diabetes is a metabolic disorder marked by excess sugar in the blood, which raises the risk of stroke, heart disease, kidney failure, and other long-term complications. Machine learning (ML) models can aid in diagnosing diabetes at an early stage, so an efficient ML model is needed to diagnose it accurately.

METHODS

In this paper, we implement a data preprocessing pipeline to process the data and apply random oversampling to balance it, handling the imbalanced class distribution of the observational data. We ran our experiments on four different diabetes datasets and compared several ML algorithms to determine the best model for predicting diabetes on each.
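The random oversampling step can be illustrated with a minimal sketch. It is not the authors' code: the function name `random_oversample`, the toy glucose values, and the fixed seed are all assumptions made for illustration, and the abstract does not describe the other preprocessing steps. The idea is only that minority-class rows are duplicated at random until every class is as large as the largest one.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data invented for illustration: one feature per row,
# 6 non-diabetic rows (label 0) and 2 diabetic rows (label 1).
rows = [[85], [90], [99], [105], [110], [92], [160], [180]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes hold six rows, and every added row is a copy of an existing minority-class row. Oversampling is commonly applied to the training split only, so that duplicated rows do not also appear in the test set; the abstract does not say how this was handled.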

RESULTS

The performance analysis shows that random forest surpasses existing work with accuracies of 86% and 98.48% on Dataset 1 and Dataset 2, respectively, while extreme gradient boosting and decision tree surpass it with accuracies of 99.27% and 100% on Dataset 3 and Dataset 4, respectively. Our proposal can increase accuracy by 12.15% compared with the model without preprocessing.
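For readers unfamiliar with the metric, the sketch below shows how an accuracy figure and a gain over a baseline could be computed. The labels and predictions are toy values invented for illustration, not the paper's data, and the helper names `accuracy` and `accuracy_gain` are assumptions. The abstract does not state whether the 12.15% increase is an absolute difference in percentage points or a relative improvement, so the sketch computes both.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_gain(baseline, improved):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = (improved - baseline) * 100
    relative = (improved - baseline) / baseline * 100
    return absolute, relative


# Toy labels and predictions, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
baseline_pred = [1, 0, 0, 1, 0, 1, 0, 0]  # 5 of 8 correct
improved_pred = [1, 0, 1, 1, 0, 1, 1, 0]  # 7 of 8 correct

base_acc = accuracy(y_true, baseline_pred)
new_acc = accuracy(y_true, improved_pred)
abs_gain, rel_gain = accuracy_gain(base_acc, new_acc)
print(f"baseline={base_acc:.3f} improved={new_acc:.3f}")
print(f"absolute gain={abs_gain:.2f} points, relative gain={rel_gain:.2f}%")
```

The two readings can differ substantially for the same pair of models, which is why it helps to state which one a reported improvement refers to.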

CONCLUSIONS

These findings indicate that the proposed models could be used to produce more accurate diabetes predictions, supplementing current preventive interventions to reduce the incidence of diabetes and its associated costs.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3fc5/11339751/b44e339ce9ec/10.1177_20552076241271867-fig1.jpg
