An ensemble learning approach for diabetes prediction using boosting techniques.

Authors

Ganie Shahid Mohammad, Pramanik Pijush Kanti Dutta, Bashir Malik Majid, Mallik Saurav, Qin Hong

Affiliations

AI Research Centre, School of Business, Woxsen University, Hyderabad, India.

School of Computer Applications and Technology, Galgotias University, Greater Noida, India.

Publication information

Front Genet. 2023 Oct 26;14:1252159. doi: 10.3389/fgene.2023.1252159. eCollection 2023.


DOI:10.3389/fgene.2023.1252159
PMID:37953921
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10639159/
Abstract

Diabetes is one of the leading healthcare concerns, affecting millions of people worldwide. Appropriate action at the earliest stages of the disease depends on early prediction and identification of diabetes. In recent years, machine learning has been explored in healthcare to support providers in the diagnosis and prognosis of diseases. This study evaluated five boosting algorithms for diabetes prediction on the Pima diabetes dataset. The dataset, obtained from the University of California, Irvine (UCI) machine learning repository, contains several important clinical features. Exploratory data analysis was used to characterise the dataset. Upsampling, normalisation, feature selection, and hyperparameter tuning were then applied for predictive analytics. The results were analysed using various statistical and machine learning metrics and k-fold cross-validation. Gradient boosting achieved the highest accuracy among all the classifiers, 92.85%. Precision, recall, F1-score, and receiver operating characteristic (ROC) curves were used to further validate the model. The suggested model outperformed current studies in prediction accuracy, demonstrating its applicability to other diseases with similar predictive indicators.
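The workflow summarised in the abstract (upsampling, normalisation, gradient boosting, k-fold cross-validation, then accuracy, precision, recall and F1) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The toy data is synthetic and stands in for the Pima dataset, the weak learner is assumed to be a depth-1 tree (a stump), feature selection, hyperparameter tuning and ROC analysis are omitted, and every function name below is hypothetical.

```python
import math
import random

def upsample(X, y, rng):
    """Randomly duplicate minority-class rows until both classes are the same size."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = pos + neg + extra
    return [X[i] for i in idx], [y[i] for i in idx]

def min_max_fit(X):
    """Per-feature (min, max), learned from training rows only."""
    return [(min(col), max(col)) for col in zip(*X)]

def min_max_apply(X, ranges):
    """Scale each feature with the stored ranges; constant features map to 0."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, ranges)] for row in X]

def fit_stump(X, residuals):
    """Depth-1 regression tree: pick the split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:  # skip max so both sides are non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_gradient_boosting(X, y, n_rounds=15, learning_rate=0.3):
    """Gradient boosting on log-loss: each stump fits the negative gradient y - p."""
    p = sum(y) / len(y)
    base = math.log(p / (1 - p))  # initial score is the log-odds of the positive class
    scores, stumps = [base] * len(X), []
    for _ in range(n_rounds):
        residuals = [label - sigmoid(s) for label, s in zip(y, scores)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        scores = [s + learning_rate * (lm if row[j] <= t else rm)
                  for s, row in zip(scores, X)]
    return base, learning_rate, stumps

def predict(model, X):
    base, lr, stumps = model
    return [1 if sigmoid(base + sum(lr * (lm if row[j] <= t else rm)
                                    for j, t, lm, rm in stumps)) >= 0.5 else 0
            for row in X]

def classification_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, b in pairs if a == 1 and b == 1)
    fp = sum(1 for a, b in pairs if a == 0 and b == 1)
    fn = sum(1 for a, b in pairs if a == 1 and b == 0)
    tn = sum(1 for a, b in pairs if a == 0 and b == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def cross_validate(X, y, k=3, seed=0):
    """k-fold CV; upsampling and scaling are fitted on each training split only."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    results = []
    for test_idx in (idx[i::k] for i in range(k)):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, y_tr = upsample([X[i] for i in train_idx], [y[i] for i in train_idx], rng)
        ranges = min_max_fit(X_tr)
        model = fit_gradient_boosting(min_max_apply(X_tr, ranges), y_tr)
        y_pred = predict(model, min_max_apply([X[i] for i in test_idx], ranges))
        results.append(classification_metrics([y[i] for i in test_idx], y_pred))
    return results

def make_toy_data(n=120, seed=1):
    """Synthetic, imbalanced two-feature data (NOT the Pima dataset)."""
    rng = random.Random(seed)
    y = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
    return X, y

X, y = make_toy_data()
fold_metrics = cross_validate(X, y)
for fold, metrics in enumerate(fold_metrics, start=1):
    print(fold, {name: round(value, 3) for name, value in metrics.items()})
```

In this sketch, upsampling and scaling are fitted inside each fold so that duplicated rows and scaling ranges never leak from the held-out split into training. Whether the study followed the same order is not stated in the abstract.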


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/42e1199f76b0/fgene-14-1252159-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/a19ce6cffbb4/fgene-14-1252159-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/7299f2101f57/fgene-14-1252159-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/4681d23354da/fgene-14-1252159-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/d535f3bf14ea/fgene-14-1252159-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/cd78d533d93e/fgene-14-1252159-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/e8478b12b46d/fgene-14-1252159-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/cb4f0e81a8e4/fgene-14-1252159-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/70ed29ef3349/fgene-14-1252159-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/da38a4baba7a/fgene-14-1252159-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/16b34e53e57a/fgene-14-1252159-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/010c401afb71/fgene-14-1252159-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dd6/10639159/e6a136375565/fgene-14-1252159-g012.jpg

Similar articles

[1]
An ensemble learning approach for diabetes prediction using boosting techniques.

Front Genet. 2023-10-26

[2]
Chronic kidney disease prediction using boosting techniques based on clinical parameters.

PLoS One. 2023

[3]
Prediction of diabetes disease using an ensemble of machine learning multi-classifier models.

BMC Bioinformatics. 2023-9-12

[4]
Interpretable prediction of acute respiratory infection disease among under-five children in Ethiopia using ensemble machine learning and Shapley additive explanations (SHAP).

Digit Health. 2024-8-6

[5]
A stacked ensemble machine learning approach for the prediction of diabetes.

J Diabetes Metab Disord. 2023-11-22

[6]
A data-driven approach to predicting diabetes and cardiovascular disease with machine learning.

BMC Med Inform Decis Mak. 2019-11-6

[7]
Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm.

Comput Intell Neurosci. 2022

[8]
An efficient ensemble based machine learning approach for predicting Chronic Kidney Disease.

Curr Med Imaging. 2023-5-8

[9]
Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research.

Diagnostics (Basel). 2023-10-26

[10]
Ensemble Machine Learning of Gradient Boosting (XGBoost, LightGBM, CatBoost) and Attention-Based CNN-LSTM for Harmful Algal Blooms Forecasting.

Toxins (Basel). 2023-10-10

Cited by

[1]
A hybrid approach for pattern recognition and interpretation in age-related false memory.

Front Psychol. 2025-7-23

[2]
Machine learning used to study risk factors for chronic diseases: A scoping review.

Can J Public Health. 2025-6-11

[3]
Enhanced and Interpretable Prediction of Multiple Cancer Types Using a Stacking Ensemble Approach with SHAP Analysis.

Bioengineering (Basel). 2025-4-29

[4]
Ensemble learning with explainable AI for improved heart disease prediction based on multiple datasets.

Sci Rep. 2025-4-22

[5]
Machine Learning-Driven D-Glucose Prediction Using a Novel Biosensor for Non-Invasive Diabetes Management.

Biosensors (Basel). 2025-3-1

[6]
Predicting total healthcare demand using machine learning: separate and combined analysis of predisposing, enabling, and need factors.

BMC Health Serv Res. 2025-3-12

[7]
Robust predictive framework for diabetes classification using optimized machine learning on imbalanced datasets.

Front Artif Intell. 2025-1-7

[8]
Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction.

Bioengineering (Basel). 2024-11-30

[9]
An automated approach to predict diabetic patients using KNN imputation and effective data mining techniques.

BMC Med Res Methodol. 2024-9-27

[10]
Analyzing classification and feature selection strategies for diabetes prediction across diverse diabetes datasets.

Front Artif Intell. 2024-8-21
