Diabetes prediction using machine learning and explainable AI techniques.

Authors

Tasin Isfafuzzaman, Nabil Tansin Ullah, Islam Sanjida, Khan Riasat

Affiliation

Electrical and Computer Engineering, North South University, Dhaka, Bangladesh.

Publication information

Healthc Technol Lett. 2022 Dec 14;10(1-2):1-10. doi: 10.1049/htl2.12039. eCollection 2023 Feb-Apr.

Abstract

Globally, diabetes affects 537 million people, making it one of the deadliest and most common non-communicable diseases. Many factors can contribute to diabetes, including excess body weight, abnormal cholesterol levels, family history, physical inactivity, and poor dietary habits. Increased urination is one of the most common symptoms of the disease. People who have had diabetes for a long time can develop complications such as heart disease, kidney disease, nerve damage, and diabetic retinopathy, but the risk can be reduced if the disease is predicted early. In this paper, an automatic diabetes prediction system has been developed using a private dataset of female patients in Bangladesh and various machine learning techniques. The authors used the Pima Indian diabetes dataset and collected additional samples from 203 individuals at a local textile factory in Bangladesh. Mutual information has been applied as the feature selection algorithm. A semi-supervised model with extreme gradient boosting has been used to predict the insulin features of the private dataset. The SMOTE and ADASYN approaches have been employed to manage the class imbalance problem. The authors compared machine learning classification methods, namely decision tree, SVM, random forest, logistic regression, KNN, and various ensemble techniques, to determine which algorithm produces the best prediction results. After training and testing all the classification models, the proposed system achieved its best result with the XGBoost classifier combined with the ADASYN approach: 81% accuracy, an F1 score of 0.81, and an AUC of 0.84. Furthermore, a domain adaptation method has been implemented to demonstrate the versatility of the proposed system. An explainable AI approach with the LIME and SHAP frameworks is used to understand how the model arrives at its final predictions. Finally, a website framework and an Android smartphone application have been developed to accept various input features and predict diabetes instantaneously. The private dataset of female Bangladeshi patients and the source code are available at the following link: https://github.com/tansin-nabil/Diabetes-Prediction-Using-Machine-Learning.
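
To illustrate the mutual information feature selection step named in the abstract, the following is a minimal sketch that scores features by their mutual information with the class label. It is not the authors' implementation: it uses a simple plug-in estimate over equal-width bins and only the Python standard library, whereas library estimators for continuous features typically work differently. The function names (`equal_width_bins`, `mutual_information`, `rank_features`) and the toy values are hypothetical and are not taken from the Pima or Bangladeshi datasets.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins=2):
    """Discretize a numeric feature into n_bins equal-width bins (0 .. n_bins-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant feature: everything in one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(feature_bins, labels):
    """Plug-in estimate of I(X; Y) in nats for a discrete feature X and labels Y."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    count_x = Counter(feature_bins)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, labels, n_bins=2):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(equal_width_bins(vals, n_bins), labels)
              for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_columns = {
    "glucose": [85, 90, 95, 100, 150, 160, 170, 180],  # separates the classes
    "noise": [1, 2, 1, 2, 1, 2, 1, 2],                 # independent of the label
}

ranking = rank_features(toy_columns, toy_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy input the class-separating feature scores ln 2 (about 0.693 nats) and the independent feature scores 0, so a selection step that keeps the top-ranked features would retain the former and drop the latter.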
