[Study on the application of classification tree model in screening the risk factors of ischemic stroke].

Authors

Yao Shuang, Li Hao, Liu Kaixiang, Leng Guangpeng, Yu Jian

Affiliations

Department of Endocrinology, the Affiliated Hospital of Guilin Medical College, Guilin 541001, Guangxi Zhuang Autonomous Region, China (Yao S, Yu J); Department of Neurology, the Affiliated Hospital of Guilin Medical College, Guilin 541001, Guangxi Zhuang Autonomous Region, China (Li H, Liu KX); Department of Cardiovascular Medicine, the Second Affiliated Hospital of Medical College, Guilin Medical University, Guilin 541100, Guangxi Zhuang Autonomous Region, China (Leng GP). Corresponding author: Yu Jian, Email:

Publication information

Zhonghua Wei Zhong Bing Ji Jiu Yi Xue. 2018 Oct;30(10):973-977. doi: 10.3760/cma.j.issn.2095-4352.2018.010.014.

Abstract

OBJECTIVE

To construct a prediction model for the risk of ischemic stroke (IS) using a classification tree model, and to evaluate its application value.

METHODS

By cluster sampling, 858 IS patients with complete clinical data admitted to the Affiliated Hospital of Guilin Medical College from January to December 2017 were enrolled (IS group), and 844 people undergoing health checkups in the same period, matched to the IS patients for gender and age, were enrolled as controls (healthy control group). The metabolic characteristics of the two groups were compared and analyzed. A classification tree model was used to construct the prediction model for the risk of IS, and the gain chart, index chart, misclassification risk value and receiver operating characteristic (ROC) curve were used to evaluate the application value of the model.

RESULTS

Compared with the healthy control group, the IS group had significantly higher body mass index (BMI), fasting blood glucose (FPG), triglyceride (TG), total cholesterol (TC) and low density lipoprotein cholesterol (LDL-C) [BMI (kg/m²): 25.34±3.70 vs. 24.24±3.10, FPG (mmol/L): 6.79±2.89 vs. 5.73±1.17, TG (mmol/L): 1.62±1.06 vs. 1.44±1.06, TC (mmol/L): 4.70±2.73 vs. 4.35±0.79, LDL-C (mmol/L): 3.18±0.94 vs. 2.73±0.73, all P < 0.01], significantly lower high density lipoprotein cholesterol (HDL-C; mmol/L: 1.12±0.33 vs. 1.35±0.36, P < 0.01), and significantly higher proportions of hypertension, smoking and drinking (69.0% vs. 41.9%, 23.1% vs. 16.8%, 19.2% vs. 13.4%, all P < 0.01).

Each factor was assigned a binary value [IS: No = 0, Yes = 1; BMI: < 24.0 kg/m² = 0, ≥ 24.0 kg/m² = 1; FPG: < 7.0 mmol/L = 0, ≥ 7.0 mmol/L = 1; TG: < 2.26 mmol/L = 0, ≥ 2.26 mmol/L = 1; TC: < 6.22 mmol/L = 0, ≥ 6.22 mmol/L = 1; LDL-C: < 4.14 mmol/L = 0, ≥ 4.14 mmol/L = 1; HDL-C: < 1.04 mmol/L = 0, ≥ 1.04 mmol/L = 1; hypertension: No = 0, Yes = 1; smoking: No = 0, Yes = 1; drinking: No = 0, Yes = 1], and a classification tree model was established to analyze the risk factors of IS. The classification tree model consisted of 4 layers and 17 nodes: the first layer was hypertension, the second layer was FPG and HDL-C, the third layer was HDL-C and FPG, and the fourth layer was LDL-C and smoking. The model screened out five explanatory variables: hypertension, FPG, HDL-C, LDL-C and smoking.

The first layer of the tree showed that the incidence of IS in the hypertensive population (62.6%) was significantly higher than that in the non-hypertensive population (35.2%).

The second layer showed that, among people with hypertension, the incidence of IS in those with HDL-C ≥ 1.04 mmol/L (53.6%) was lower than that in those with HDL-C < 1.04 mmol/L (78.5%). Among people without hypertension, the probability of IS in those with FPG ≥ 7.0 mmol/L (71.1%) was significantly higher than that in those with FPG < 7.0 mmol/L (28.3%).

The third layer showed that, among people without hypertension and with FPG < 7.0 mmol/L, the incidence of IS in those with HDL-C ≥ 1.04 mmol/L (21.8%) was lower than that in those with HDL-C < 1.04 mmol/L (48.7%). Among people with hypertension and HDL-C ≥ 1.04 mmol/L, the probability of IS in those with FPG ≥ 7.0 mmol/L (78.6%) was significantly higher than that in those with FPG < 7.0 mmol/L (46.7%).

The fourth layer showed that, among people without hypertension and with FPG < 7.0 mmol/L and HDL-C ≥ 1.04 mmol/L, the incidence of IS in those with LDL-C ≥ 4.14 mmol/L (53.8%) was higher than that in those with LDL-C < 4.14 mmol/L (19.0%). Among people without hypertension and with FPG < 7.0 mmol/L and HDL-C < 1.04 mmol/L, the incidence of IS in smokers (76.9%) was higher than that in non-smokers (39.1%). Among people with hypertension, HDL-C ≥ 1.04 mmol/L and FPG < 7.0 mmol/L, the probability of IS in those with LDL-C ≥ 4.14 mmol/L (72.5%) was higher than that in those with LDL-C < 4.14 mmol/L (44.4%).

The gain chart of the IS classification tree model showed that the gain value increased rapidly from 0% to 100% and then leveled off. The index chart showed that the index value remained stable above 100% and then dropped rapidly to 100%, indicating that the model performed well. The misclassification risk value of the classification tree model was 0.291, and its correct classification rate was 70.90%. The area under the ROC curve (AUC) was 78.0% [95% confidence interval (95%CI) = 75.9%-79.9%, P < 0.001], the sensitivity was 62.5% (95%CI = 59.1%-65.7%) and the specificity was 79.4% (95%CI = 76.5%-82.1%).
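The branching reported above can be written out as nested rules. The following Python sketch is illustrative only: the function name `is_proportion` and its signature are hypothetical, and the nine terminal nodes are inferred from the abstract's description (they are consistent with the stated 17 nodes) rather than taken from the published tree figure. The returned values are the IS percentages quoted for each node in this study sample, not a validated individual risk estimate.

```python
# Cut-offs copied from the abstract's coding scheme (value 1 = at or above cut-off).
FPG_CUTOFF = 7.0    # mmol/L
HDL_CUTOFF = 1.04   # mmol/L
LDL_CUTOFF = 4.14   # mmol/L


def is_proportion(hypertension: bool, fpg: float, hdl_c: float,
                  ldl_c: float, smoker: bool) -> float:
    """Return the reported IS percentage of the terminal node a person falls into.

    Hypothetical helper: the node layout is inferred from the abstract text.
    """
    high_fpg = fpg >= FPG_CUTOFF
    high_hdl = hdl_c >= HDL_CUTOFF
    high_ldl = ldl_c >= LDL_CUTOFF

    if hypertension:
        # Layer 2 split on HDL-C, layer 3 on FPG, layer 4 on LDL-C.
        if not high_hdl:
            return 78.5
        if high_fpg:
            return 78.6
        return 72.5 if high_ldl else 44.4

    # No hypertension: layer 2 split on FPG, layer 3 on HDL-C.
    if high_fpg:
        return 71.1
    if high_hdl:
        # Layer 4 split on LDL-C.
        return 53.8 if high_ldl else 19.0
    # Layer 4 split on smoking.
    return 76.9 if smoker else 39.1


if __name__ == "__main__":
    # Example: hypertensive, normal FPG, HDL-C at or above cut-off, high LDL-C.
    print(is_proportion(True, fpg=5.6, hdl_c=1.20, ldl_c=4.50, smoker=False))
```

Because the controls were sampled to roughly match the number of cases, these percentages describe the composition of each node in the study sample and should not be read as population incidence.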

CONCLUSIONS

The classification tree model can adequately predict the risk of IS, and the most important risk factors are hypertension, hyperglycemia, high LDL-C and smoking.
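As a rough check on the reported performance figures, the correct classification rate can be recomputed in two ways: as one minus the misclassification risk, and from the sensitivity and specificity weighted by the two group sizes. The short Python sketch below does this with the numbers quoted in the abstract. It assumes that the sensitivity and specificity were reported at the same operating point as the tree's classification, which the abstract does not state, and all variable names are illustrative.

```python
# Figures quoted in the abstract.
n_is, n_control = 858, 844                 # IS group and healthy control group
sensitivity, specificity = 0.625, 0.794
misclassification_risk = 0.291

# Route 1: accuracy is the complement of the misclassification risk.
accuracy_from_risk = 1 - misclassification_risk

# Route 2: correctly classified cases plus correctly classified controls,
# divided by the total sample (assumes the same operating point; see lead-in).
true_positives = sensitivity * n_is
true_negatives = specificity * n_control
accuracy_from_roc = (true_positives + true_negatives) / (n_is + n_control)

print(f"{accuracy_from_risk:.3f} {accuracy_from_roc:.3f}")
```

Under that assumption the two routes agree to about one decimal place of a percent, in line with the reported correct rate of 70.90%.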

