Deep learning based decision tree ensembles for incomplete medical datasets.

Affiliations

Division of Thoracic Surgery, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan.

Department of Information Management, National Central University, Taoyuan, Taiwan.

Publication information

Technol Health Care. 2024;32(1):75-87. doi: 10.3233/THC-220514.

DOI:10.3233/THC-220514

PMID:37248924

Abstract

BACKGROUND

In practice, datasets collected for analysis are usually incomplete because some records contain missing attribute values. Many related works focus on constructing specific models that estimate replacements for the missing values, thereby making the original incomplete datasets complete. Another type of solution handles the incomplete datasets directly, without missing value imputation, with decision trees being the major technique for this purpose.

OBJECTIVE

To introduce a novel approach, namely Deep Learning-based Decision Tree Ensembles (DLDTE), which borrows the bounding box and sliding window strategies used in deep learning techniques to divide an incomplete dataset into a number of subsets and learns from each subset with a decision tree, resulting in decision tree ensembles.
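
The abstract does not specify how the windows are defined, so the following is only a minimal sketch of the general idea under stated assumptions: a fixed-width window slides over the feature columns, each window keeps the rows that are complete within it, one learner is trained per window, and predictions are combined by majority vote. The names `sliding_windows`, `complete_subset`, `fit_stump`, `fit_ensemble`, and `predict` are hypothetical, missing values are represented as `None`, and a depth-1 decision stump stands in for a full decision tree to keep the example dependency-free. The bounding box strategy is not modeled here.

```python
from collections import Counter

def sliding_windows(n_features, width, stride):
    """Column-index lists for each window (assumed: 1-D slide over feature columns)."""
    starts = range(0, max(n_features - width, 0) + 1, stride)
    return [list(range(s, min(s + width, n_features))) for s in starts]

def complete_subset(X, y, cols):
    """Keep only rows with no missing value (None) inside the window."""
    rows = [i for i, row in enumerate(X) if all(row[c] is not None for c in cols)]
    return [[X[i][c] for c in cols] for i in rows], [y[i] for i in rows]

def fit_stump(Xs, ys):
    """Depth-1 tree (stand-in for a decision tree): fewest training errors wins."""
    majority = Counter(ys).most_common(1)[0][0]
    best = (len(ys) + 1, None, None, majority, majority)
    for f in range(len(Xs[0])):
        for t in sorted({row[f] for row in Xs}):
            left = [lab for row, lab in zip(Xs, ys) if row[f] <= t]
            right = [lab for row, lab in zip(Xs, ys) if row[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)

def fit_ensemble(X, y, width, stride):
    """Train one stump per window on the rows that are complete in that window."""
    models = []
    for cols in sliding_windows(len(X[0]), width, stride):
        Xs, ys = complete_subset(X, y, cols)
        if Xs:
            models.append((cols, fit_stump(Xs, ys)))
    return models

def predict(models, row):
    """Majority vote over windows whose features are all observed in `row`."""
    votes = []
    for cols, (f, t, l_lab, r_lab) in models:
        if any(row[c] is None for c in cols):
            continue  # assumed: a window abstains when any of its features is missing
        votes.append(l_lab if f is None or row[cols[f]] <= t else r_lab)
    return Counter(votes).most_common(1)[0][0] if votes else None

# Toy data (made up for illustration; None marks a missing attribute value).
X = [
    [1.0, None, 5.0, 0.2],
    [2.0, 1.0, None, 0.1],
    [8.0, 9.0, 6.0, None],
    [9.0, 8.0, 7.0, 0.9],
    [None, 7.5, 6.5, 0.8],
    [1.5, 0.5, 5.5, None],
]
y = [0, 0, 1, 1, 1, 0]
models = fit_ensemble(X, y, width=2, stride=1)
```

In this toy setting no row of `X` is fully observed, so case deletion would leave nothing to train on, yet every window still finds complete rows. A query such as `predict(models, [1.2, 0.8, None, 0.3])` is answered only by the windows that do not touch the missing third feature.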

METHOD

Two medical domain datasets containing several hundred feature dimensions, with missing rates of 10% to 50%, are used for performance comparison.

RESULTS

The proposed DLDTE achieves the highest classification accuracy when compared with the baseline decision tree method, two missing value imputation methods (mean and k-nearest neighbor), and the case deletion method.
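
For readers unfamiliar with the comparison methods, the sketch below illustrates two of them, case deletion and mean imputation, on a tiny made-up table. It is not the authors' code: the function names `case_deletion` and `mean_impute` are hypothetical, missing values are assumed to be encoded as `None`, and the fallback of `0.0` for a fully missing column is an arbitrary choice for the example.

```python
def case_deletion(X, y):
    """Drop every row that has at least one missing value (None)."""
    kept = [(row, lab) for row, lab in zip(X, y) if None not in row]
    return [row for row, _ in kept], [lab for _, lab in kept]

def mean_impute(X):
    """Replace each missing value with the mean of the observed values in its column."""
    means = []
    for c in range(len(X[0])):
        observed = [row[c] for row in X if row[c] is not None]
        # Assumed fallback: 0.0 when a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if v is None else v for c, v in enumerate(row)] for row in X]

# Toy data, made up for illustration.
X_toy = [[1.0, None], [3.0, 4.0], [None, 8.0]]
y_toy = [0, 1, 1]

X_deleted, y_deleted = case_deletion(X_toy, y_toy)
X_imputed = mean_impute(X_toy)
```

Case deletion keeps only the single fully observed row here, which shows why it becomes impractical as the missing rate grows, while mean imputation retains all rows at the cost of filling in estimated values.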