NASA Goddard Space Flight Center, Greenbelt, MD 20771; Defense Mapping Agency, Washington, DC 20315.
IEEE Trans Pattern Anal Mach Intell. 1982 Jan;4(1):51-7. doi: 10.1109/tpami.1982.4767195.
The classification of large dimensional data sets arising from the merging of remote sensing data with more traditional forms of ancillary data causes a significant computational problem. Decision tree classification is a popular approach to the problem. This type of classifier is characterized by the property that samples are subjected to a sequence of decision rules before they are assigned to a unique class. If a decision tree classifier is well designed, the result in many cases is a classification scheme which is accurate, flexible, and computationally efficient. This correspondence provides an automated technique for effective decision tree design which relies only on a priori statistics. This procedure utilizes canonical transforms and Bayes table look-up decision rules. An optimal design at each node is derived based on the associated decision table. A procedure for computing the global probability of correct classification is also provided. An example is given in which class statistics obtained from an actual Landsat scene are used as input to the program. The resulting decision tree design has an associated probability of correct classification of 0.75 compared to the theoretically optimum 0.79 probability of correct classification associated with a full dimensional Bayes classifier. Recommendations for future research are included.
大型数据集的分类,这些数据集是通过将遥感数据与更传统的辅助数据形式合并而产生的,这会导致一个重大的计算问题。决策树分类是解决这个问题的一种流行方法。这种分类器的特点是样本在被分配到唯一的类别之前,要经过一系列的决策规则。如果决策树分类器设计得好,那么在许多情况下,分类方案是准确的、灵活的、计算效率高的。本文提供了一种仅依赖于先验统计的有效决策树设计的自动化技术。该过程利用规范变换和贝叶斯表查找决策规则。基于相关的决策表,在每个节点上导出最优设计。还提供了一种计算正确分类全局概率的方法。给出了一个例子,其中使用来自实际 Landsat 场景的类统计信息作为程序的输入。所得到的决策树设计的正确分类概率为 0.75,而与全维贝叶斯分类器相关联的理论最优 0.79 的正确分类概率相比,其关联概率为 0.75。包括对未来研究的建议。