Analyzing classification and feature selection strategies for diabetes prediction across diverse diabetes datasets.

Authors

Kaliappan Jayakumar, Saravana Kumar I J, Sundaravelan S, Anesh T, Rithik R R, Singh Yashbir, Vera-Garcia Diana V, Himeur Yassine, Mansoor Wathiq, Atalla Shadi, Srinivasan Kathiravan

Affiliations

School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India.

Radiology, Mayo Clinic, Rochester, MN, United States.

Publication information

Front Artif Intell. 2024 Aug 21;7:1421751. doi: 10.3389/frai.2024.1421751. eCollection 2024.

DOI:10.3389/frai.2024.1421751
PMID:39233892
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11371799/
Abstract

INTRODUCTION

In the evolving landscape of healthcare and medicine, the merging of extensive medical datasets with the powerful capabilities of machine learning (ML) models presents a significant opportunity for transforming diagnostics, treatments, and patient care.

METHODS

This research paper examines data-driven healthcare, focusing on identifying the most effective ML models for diabetes prediction and on uncovering the critical features that aid this prediction. Prediction performance is analyzed using a variety of ML models, such as Random Forest (RF), XGBoost (XGB), Linear Regression (LR), Gradient Boosting (GB), and Support Vector Machine (SVM), across numerous medical datasets. Feature importance is studied using Filter-based techniques, Wrapper-based techniques, and Explainable Artificial Intelligence (Explainable AI). Explainable AI techniques, specifically Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), are used to make the models' decision-making process transparent, thereby bolstering trust in AI-driven decisions.
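The idea behind SHAP-style attribution can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature coalitions. This is not the authors' pipeline and does not use the SHAP library; the function `shapley_values`, the scoring function `toy_risk`, its coefficients, and the all-zero baseline are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(instance)
    features = range(n)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in features]
        return predict(x)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(x):
    """Hypothetical risk score over three binary features (illustrative only)."""
    polyuria, polydipsia, high_bp = x
    return (0.1 + 0.4 * polyuria + 0.3 * polydipsia
            + 0.1 * high_bp + 0.1 * polyuria * polydipsia)


instance = [1, 1, 0]   # polyuria and polydipsia present, no high blood pressure
baseline = [0, 0, 0]   # assumed reference patient with no symptoms
phi = shapley_values(toy_risk, instance, baseline)
```

The attributions sum to the difference between the score for the instance and the score for the baseline, which is the additivity property that SHAP explanations rely on. The interaction term is split equally between the two interacting features, and a feature whose value matches the baseline receives zero attribution.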

RESULTS

Features identified by RF in Wrapper-based techniques and by the Chi-square test in Filter-based techniques have been shown to enhance prediction performance. Notable precision and recall values, reaching up to 0.9, are achieved in predicting diabetes.
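As a rough illustration of the two quantities named here, the sketch below scores a binary feature against a binary label with the textbook Pearson chi-square statistic and computes precision and recall for a set of predictions. It assumes binary features and labels; the toy arrays are invented for the example and are not data from the study, and the abstract does not state which chi-square implementation the authors used.

```python
def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for two binary sequences (no continuity correction)."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented toy data: 1 = present / diabetic, 0 = absent / non-diabetic
label    = [1, 1, 1, 1, 0, 0, 0, 0]
polyuria = [1, 1, 1, 0, 0, 0, 0, 1]   # mostly tracks the label
noise    = [1, 0, 1, 0, 1, 0, 1, 0]   # unrelated to the label

scores = {
    "polyuria": chi_square_2x2(polyuria, label),
    "noise": chi_square_2x2(noise, label),
}
# A filter method keeps the features with the highest scores.
ranked = sorted(scores, key=scores.get, reverse=True)

# Using the polyuria flag itself as a naive prediction of the label
precision, recall = precision_recall(label, polyuria)
```

A filter-based selector ranks features by a score such as this one before any model is trained, whereas a wrapper-based selector evaluates feature subsets by repeatedly fitting a model such as RF.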

DISCUSSION

Both approaches are found to assign considerable importance to features like age, family history of diabetes, polyuria, polydipsia, and high blood pressure, which are strongly associated with diabetes. In this age of data-driven healthcare, the research presented here aspires to substantially improve healthcare outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/26fe35bedacd/frai-07-1421751-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/70913131500f/frai-07-1421751-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/cc911f53e543/frai-07-1421751-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/3ddfc7831b24/frai-07-1421751-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/7f1b04aa18dc/frai-07-1421751-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/69e5e91a5601/frai-07-1421751-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/f4e72f50e977/frai-07-1421751-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/a9e9ea933ba5/frai-07-1421751-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/1e8c92873d53/frai-07-1421751-g0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/a98db9a1c65c/frai-07-1421751-g0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/4ea4d18ea7e6/frai-07-1421751-g0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/82e8d02596a8/frai-07-1421751-g0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/43e710531338/frai-07-1421751-g0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/62f255f50d91/frai-07-1421751-g0014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2863/11371799/2d9ab634d5f8/frai-07-1421751-g0015.jpg

Similar articles

1
Analyzing classification and feature selection strategies for diabetes prediction across diverse diabetes datasets.
Front Artif Intell. 2024 Aug 21;7:1421751. doi: 10.3389/frai.2024.1421751. eCollection 2024.
2
Beyond black-box models: explainable AI for embryo ploidy prediction and patient-centric consultation.
J Assist Reprod Genet. 2024 Sep;41(9):2349-2358. doi: 10.1007/s10815-024-03178-7. Epub 2024 Jul 4.
3
Responsible AI for cardiovascular disease detection: Towards a privacy-preserving and interpretable model.
Comput Methods Programs Biomed. 2024 Sep;254:108289. doi: 10.1016/j.cmpb.2024.108289. Epub 2024 Jun 17.
4
Explainable Machine Learning to Predict Successful Weaning Among Patients Requiring Prolonged Mechanical Ventilation: A Retrospective Cohort Study in Central Taiwan.
Front Med (Lausanne). 2021 Apr 23;8:663739. doi: 10.3389/fmed.2021.663739. eCollection 2021.
5
Explainable artificial intelligence model for identifying COVID-19 gene biomarkers.
Comput Biol Med. 2023 Mar;154:106619. doi: 10.1016/j.compbiomed.2023.106619. Epub 2023 Feb 1.
6
Explainable artificial intelligence for LDL cholesterol prediction and classification.
Clin Biochem. 2024 Aug;130:110791. doi: 10.1016/j.clinbiochem.2024.110791. Epub 2024 Jul 6.
7
Explainable machine learning models based on multimodal time-series data for the early detection of Parkinson's disease.
Comput Methods Programs Biomed. 2023 Jun;234:107495. doi: 10.1016/j.cmpb.2023.107495. Epub 2023 Mar 23.
8
An Explainable Artificial Intelligence Framework for the Deterioration Risk Prediction of Hepatitis Patients.
J Med Syst. 2021 Apr 13;45(5):61. doi: 10.1007/s10916-021-01736-5.
9
IHCP: interpretable hepatitis C prediction system based on black-box machine learning models.
BMC Bioinformatics. 2023 Sep 6;24(1):333. doi: 10.1186/s12859-023-05456-0.
10
Machine Learning Models for Predicting Influential Factors of Early Outcomes in Acute Ischemic Stroke: Registry-Based Study.
JMIR Med Inform. 2022 Mar 25;10(3):e32508. doi: 10.2196/32508.

Cited by

1
Data-driven diabetes mellitus prediction and management: a comparative evaluation of decision tree classifier and artificial neural network models along with statistical analysis.
Sci Rep. 2025 Jun 2;15(1):19339. doi: 10.1038/s41598-025-03718-w.
2
Energy landscape analysis of health checkup data clarified multiple pathways to diabetes development in obese and non-obese subjects.
Front Endocrinol (Lausanne). 2025 May 6;16:1576431. doi: 10.3389/fendo.2025.1576431. eCollection 2025.
