Decision tree methods: applications for classification and prediction.

Authors

Song Yan-Yan, Lu Ying

Affiliation

Department of Pharmacology and Biostatistics, Institute of Medical Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Publication information

Shanghai Arch Psychiatry. 2015 Apr 25;27(2):130-5. doi: 10.11919/j.issn.1002-0829.215044.

DOI:10.11919/j.issn.1002-0829.215044

PMID:26120265

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4466856/

Abstract

Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets. The training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces algorithms frequently used to develop decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.
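The training/validation workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure or the SPSS/SAS programs the paper describes. It is a small CART-style tree written in plain Python under several assumptions: Gini impurity is the split criterion, maximum depth stands in for "tree size", the data are synthetic with 10% label noise, and every function name (`gini`, `best_split`, `build`, `predict`, `accuracy`) is hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels, max_depth):
    """Grow a tree. Leaves are class labels (ints); internal nodes are tuples."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build([r for r, _ in right], [y for _, y in right], max_depth - 1))

def predict(tree, row):
    """Walk from the root node down to a leaf node."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

# Synthetic data (an assumption for illustration): two covariates, binary target.
random.seed(0)
rows = [(random.random(), random.random()) for _ in range(300)]
labels = []
for x0, x1 in rows:
    y = int(x0 > 0.5 and x1 > 0.3)
    if random.random() < 0.1:  # flip roughly 10% of labels as noise
        y = 1 - y
    labels.append(y)

# Divide the study data into training and validation datasets.
train_X, train_y = rows[:200], labels[:200]
valid_X, valid_y = rows[200:], labels[200:]

# Build trees of increasing size on the training data and
# score each candidate on the validation data.
scores = {}
for depth in range(7):
    scores[depth] = accuracy(build(train_X, train_y, depth), valid_X, valid_y)

# Prefer the smallest tree among those with the best validation accuracy.
best_depth = min(scores, key=lambda d: (-scores[d], d))
final_tree = build(train_X, train_y, best_depth)
print("chosen depth:", best_depth, "validation accuracy:", scores[best_depth])
```

Breaking ties toward the smaller depth reflects the usual preference for a simpler tree when validation performance is equal. Pruning a fully grown tree, as CART does, is an alternative way of choosing tree size that this sketch does not implement.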

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf92/4466856/cfdb52d3c664/sap-27-02-130-g002.jpg
