Incremental hierarchical discriminant regression.

Author information

Weng Juyang, Hwang Wey-Shiuan

Affiliation

Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.

Publication information

IEEE Trans Neural Netw. 2007 Mar;18(2):397-415. doi: 10.1109/TNN.2006.889942.

DOI:10.1109/TNN.2006.889942

PMID:17385628

Abstract

This paper presents incremental hierarchical discriminant regression (IHDR), which incrementally builds a decision tree or regression tree for very high-dimensional regression or decision spaces using an online, real-time learning system. Biologically motivated, it is an approximate computational model for the automatic development of associative cortex, with both bottom-up sensory inputs and top-down motor projections. At each internal node of the IHDR tree, information in the output space is used to automatically derive the local subspace spanned by the most discriminating features. Embedded in the tree is a hierarchical probability distribution model used to prune very unlikely cases during the search. The number of parameters in the coarse-to-fine approximation is dynamic and data-driven, enabling the IHDR tree to automatically fit data with unknown distribution shapes, for which the number of parameters is difficult to select up front. The IHDR tree dynamically assigns long-term memory to avoid the loss-of-memory problem typical of global-fitting learning algorithms for neural networks. A major challenge for an incrementally built tree is that the number of samples varies arbitrarily during the construction process. An incrementally updated probability model, called the sample-size-dependent negative-log-likelihood (SDNLL) metric, is used to deal with large sample-size cases, small sample-size cases, and unbalanced sample-size cases, measured among different internal nodes of the IHDR tree. We report experimental results for four types of data: synthetic data to visualize the behavior of the algorithms, large face image data, a continuous video stream from robot navigation, and publicly available data sets that use human-defined features.
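The idea of using output-space information to derive a discriminating input subspace at a node can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Node` class, its `max_clusters` parameter, the seeding rule, and the use of Gram-Schmidt on x-cluster mean offsets are all assumptions made for illustration. Each sample is assigned to the nearest cluster in output (y) space, the paired input (x) cluster mean is updated incrementally, and the subspace is taken as the span of the x-cluster means relative to their weighted center.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def incremental_mean(mean, count, sample):
    """Fold one new sample into a running mean; returns (new_mean, new_count)."""
    count += 1
    new_mean = [m + (s - m) / count for m, s in zip(mean, sample)]
    return new_mean, count


class Node:
    """Hypothetical internal node holding paired y-clusters and x-clusters."""

    def __init__(self, max_clusters):
        self.max_clusters = max_clusters
        self.y_means, self.x_means, self.counts = [], [], []

    def update(self, x, y):
        # Assumption: the first max_clusters samples each seed a cluster pair.
        if len(self.counts) < self.max_clusters:
            self.y_means.append([float(v) for v in y])
            self.x_means.append([float(v) for v in x])
            self.counts.append(1)
            return
        # The nearest cluster in OUTPUT space decides which x-cluster is updated.
        i = min(range(len(self.counts)),
                key=lambda k: sq_dist(y, self.y_means[k]))
        self.y_means[i], _ = incremental_mean(self.y_means[i], self.counts[i], y)
        self.x_means[i], self.counts[i] = incremental_mean(
            self.x_means[i], self.counts[i], x)

    def discriminating_basis(self, tol=1e-9):
        """Orthonormal basis of the span of x-cluster means around their center."""
        total = sum(self.counts)
        dim = len(self.x_means[0])
        center = [sum(n * m[d] for n, m in zip(self.counts, self.x_means)) / total
                  for d in range(dim)]
        basis = []
        for mean in self.x_means:
            v = [m - c for m, c in zip(mean, center)]
            # Gram-Schmidt: remove components along directions already kept.
            for b in basis:
                proj = sum(vi * bi for vi, bi in zip(v, b))
                v = [vi - proj * bi for vi, bi in zip(v, b)]
            norm = math.sqrt(sum(vi * vi for vi in v))
            if norm > tol:  # drop linearly dependent directions
                basis.append([vi / norm for vi in v])
        return center, basis


def project(x, center, basis):
    """Coordinates of x in the discriminating subspace."""
    return [sum((xi - ci) * bi for xi, ci, bi in zip(x, center, b)) for b in basis]
```

With q clusters, the sketch yields at most q - 1 basis vectors, so a node compares samples in a space far smaller than the raw input dimension. For example, feeding a two-cluster `Node` samples whose first input coordinate tracks the output while the others do not should leave a single basis vector aligned with that first coordinate.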
