The application of a decision tree to establish the parameters associated with hypertension.

Author information

Tayefi Maryam, Esmaeili Habibollah, Saberi Karimian Maryam, Amirabadi Zadeh Alireza, Ebrahimi Mahmoud, Safarian Mohammad, Nematy Mohsen, Parizadeh Seyed Mohammad Reza, Ferns Gordon A, Ghayour-Mobarhan Majid

Affiliations

Biochemistry and Nutrition Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Biochemistry and Nutrition Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran; Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

Publication information

Comput Methods Programs Biomed. 2017 Feb;139:83-91. doi: 10.1016/j.cmpb.2016.10.020. Epub 2016 Oct 24.

Abstract

INTRODUCTION

Hypertension is an important risk factor for cardiovascular disease (CVD). The goal of this study was to establish the factors associated with hypertension using a decision-tree algorithm, a supervised classification method from data mining.

METHODS

Data from a cross-sectional study were used. A total of 9078 subjects who met the inclusion criteria were recruited. Of these, 70% (6358 cases) were randomly allocated to the training dataset used to construct the decision tree. The remaining 30% (2720 cases) were used as the testing dataset to evaluate the performance of the decision tree. Two models were evaluated. In model I, the input variables were age, gender, body mass index, marital status, level of education, occupation status, depression and anxiety status, physical activity level, smoking status, LDL, TG, TC, FBG, uric acid and hs-CRP. In model II, the input variables were age, gender, WBC, RBC, HGB, HCT, MCV, MCH, PLT, RDW and PDW. The models were validated by constructing a receiver operating characteristic (ROC) curve.
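The evaluation protocol above (a random 70/30 split followed by ROC analysis) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names `split_70_30` and `roc_auc`, the seed, and the use of integer subject IDs are hypothetical, and the abstract does not describe the authors' actual allocation procedure or software. An exact 70% cut of 9078 subjects gives 6355 and 2723, close to but not identical to the reported 6358 and 2720.

```python
import random


def split_70_30(subject_ids, seed=0):
    """Shuffle subject IDs and split them roughly 70/30 into train and test.

    Assumption: a simple shuffle-and-cut; the study's own procedure is not stated.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical subject IDs 0..9077 stand in for the 9078 recruited subjects.
train_ids, test_ids = split_70_30(range(9078))
print(len(train_ids), len(test_ids))
```

The pairwise definition of AUC used here is quadratic in the number of cases, which is acceptable for a test set of a few thousand subjects; a rank-based formulation would give the same value more efficiently.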

RESULTS

The prevalence of hypertension in our population was 32%. For decision-tree model I, the accuracy, sensitivity, specificity and area under the ROC curve (AUC) in identifying the risk factors related to hypertension were 73%, 63%, 77% and 0.72, respectively. The corresponding values for model II were 70%, 61%, 74% and 0.68, respectively.
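As a quick consistency check on these figures, accuracy can be written as prevalence × sensitivity + (1 − prevalence) × specificity. The sketch below applies that identity to the reported values. It assumes that the prevalence of hypertension in the testing dataset equals the overall 32%, which the abstract does not state; the function name `implied_accuracy` is hypothetical.

```python
def implied_accuracy(prevalence, sensitivity, specificity):
    """Accuracy implied by sensitivity and specificity at a given prevalence."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Assumption: test-set prevalence equals the overall 32% reported for the population.
prevalence = 0.32

model_1 = implied_accuracy(prevalence, sensitivity=0.63, specificity=0.77)
model_2 = implied_accuracy(prevalence, sensitivity=0.61, specificity=0.74)

print(f"Model I implied accuracy:  {model_1:.3f}")
print(f"Model II implied accuracy: {model_2:.3f}")
```

Under that assumption the implied accuracies are about 0.725 and 0.698, which agree with the reported 73% and 70% after rounding.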

CONCLUSION

We have developed a decision tree model to identify the risk factors associated with hypertension that may be used to develop programs for hypertension management.
