A Comparison between Decision Tree and Random Forest in Determining the Risk Factors Associated with Type 2 Diabetes.

Authors

Esmaily Habibollah, Tayefi Maryam, Doosti Hassan, Ghayour-Mobarhan Majid, Nezami Hossein, Amirabadizadeh Alireza

Affiliations

Social Determinants of Health Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.

Clinical Research Unit, Mashhad University of Medical Sciences, Mashhad, Iran.

Publication information

J Res Health Sci. 2018 Apr 24;18(2):e00412.


DOI:
PMID:29784893
Abstract

BACKGROUND: We aimed to identify risk factors associated with type 2 diabetes mellitus (T2DM) by applying two data mining techniques, decision trees and random forests, to data from the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) Study program. STUDY DESIGN: A cross-sectional study. METHODS: The MASHAD study started in 2010 and will continue until 2020. Two data mining tools, decision trees and random forests, were used to predict T2DM from other characteristics observed in 9528 subjects recruited from the MASHAD database. The two models were compared in terms of accuracy, sensitivity, specificity, and area under the ROC curve. RESULTS: The prevalence of T2DM among these subjects was 14%. The decision tree model had 64.9% accuracy, 64.5% sensitivity, 66.8% specificity, and an area under the ROC curve of 68.6%, while the random forest model had 71.1% accuracy, 71.3% sensitivity, 69.9% specificity, and an area under the ROC curve of 77.3%. CONCLUSIONS: The random forest model, when used with demographic, clinical, anthropometric, and biochemical measurements, can provide a simple tool for identifying risk factors associated with type 2 diabetes. Such identification can be of substantial use in shaping health policy to reduce the number of subjects with T2DM.
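
The four comparison criteria named in the abstract (accuracy, sensitivity, specificity, and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and no MASHAD data is used. The area under the ROC curve is computed here with the rank-based (Mann-Whitney) formulation, which may differ from whatever software the study used.

```python
from typing import Dict, Sequence


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = T2DM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives, 5 negatives, with model scores in [0, 1].
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Reporting sensitivity and specificity alongside accuracy matters at the prevalence stated in the abstract: with 14% of subjects having T2DM, a rule that labels everyone as non-diabetic would reach 86% accuracy while detecting no cases at all.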
