Comparative approaches for classification of diabetes mellitus data: Machine learning paradigm.

Author affiliations

Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh.

Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh.

Publication information

Comput Methods Programs Biomed. 2017 Dec;152:23-34. doi: 10.1016/j.cmpb.2017.09.004. Epub 2017 Sep 8.

DOI: 10.1016/j.cmpb.2017.09.004
PMID: 29054258

Abstract

BACKGROUND AND OBJECTIVE

Diabetes is a silent killer. Its main cause is the presence of excessive amounts of metabolites such as glucose. In 2014, about 387 million people worldwide had diabetes, and the financial burden of the disease has been calculated at about $13,700 per year. According to the World Health Organization (WHO), these figures will more than double by 2030. This cost could be reduced dramatically if diabetes could be predicted statistically from a set of covariates. Although several classification techniques are available, classifying diabetes remains very difficult. The main objectives of this paper are: (i) to apply Gaussian process classification (GPC), (ii) to compare classifiers for diabetes data, (iii) to analyze the data using cross-validation, (iv) to interpret the results of the analysis, and (v) to benchmark our method against others.

METHODS

To classify diabetes, several techniques are commonly used, such as linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and naive Bayes (NB). However, most medical data show non-normality, non-linearity, and an inherent correlation structure. In this paper we therefore adopt a Gaussian process (GP)-based classification technique with three kernels: linear, polynomial, and radial basis. We also compare the performance of the GP-based technique with that of the existing techniques LDA, QDA, and NB. Performance is evaluated using accuracy (ACC), sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), and receiver operating characteristic (ROC) curves.

RESULTS

The Pima Indian diabetes dataset is used in this study. It consists of 768 patients, of whom 268 are diabetic and 500 are controls. The GP-based model achieves ACC 81.97%, SE 91.79%, SP 63.33%, PPV 84.91%, and NPV 62.50%, which are higher than those of the other methods.
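
For readers unfamiliar with the reported measures, the sketch below shows how ACC, SE, SP, PPV, and NPV follow from the four cells of a confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and purely illustrative; they are not taken from the paper and do not reproduce its results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: true positives, fn: false negatives,
    tn: true negatives, fp: false positives.
    """
    total = tp + fn + tn + fp
    return {
        "ACC": (tp + tn) / total,   # accuracy: fraction of all cases classified correctly
        "SE": tp / (tp + fn),       # sensitivity: fraction of actual positives detected
        "SP": tn / (tn + fp),       # specificity: fraction of actual negatives detected
        "PPV": tp / (tp + fp),      # positive predictive value: precision of positive calls
        "NPV": tn / (tn + fn),      # negative predictive value: precision of negative calls
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note that SE and SP depend on which class is treated as positive, so the same confusion matrix yields swapped values if the labels are reversed.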
