Leung Kwong-Sak, Wong Man-Leung, Lam Wai, Wang Zhenyuan, Xu Kebin
Dept. of Comput. Sci. & Eng., Chinese Univ. of Hong Kong, Shatin, China.
IEEE Trans Syst Man Cybern B Cybern. 2002;32(5):630-44. doi: 10.1109/TSMCB.2002.1033182.
This paper describes a novel knowledge discovery and data mining framework dealing with nonlinear interactions among domain attributes. Our network-based model provides an effective and efficient reasoning procedure to perform prediction and decision making. Unlike many existing paradigms based on linear models, the attribute relationship in our framework is represented by nonlinear nonnegative multiregressions based on the Choquet integral. This kind of multiregression is able to model a rich set of nonlinear interactions directly. Our framework involves two layers. The outer layer is a network structure consisting of network elements as its components, while the inner layer is concerned with a particular network element modeled by Choquet integrals. We develop a fast double optimization algorithm (FDOA) for learning the multiregression coefficients of a single network element. Using this local learning component and multiregression-residual-cost evolutionary programming (MRCEP), we propose a global learning algorithm, called MRCEP-FDOA, for discovering the network structures and their elements from databases. We have conducted a series of experiments to assess the effectiveness of our algorithm and investigate the performance under different parameter combinations, as well as sizes of the training data sets. The empirical results demonstrate that our framework can successfully discover the target network structure and the regression coefficients.
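As an illustration of how a Choquet integral captures interactions among attributes, the following is a minimal sketch of the discrete Choquet integral over a finite attribute set. It is not the paper's FDOA or MRCEP-FDOA procedure. The function name `choquet_integral`, the dictionary-based representation of the set function `mu`, and the numeric values are all assumptions chosen for illustration; in the framework described above, the values of `mu` would play the role of the regression coefficients to be learned.

```python
def choquet_integral(values, mu):
    """Discrete Choquet integral of nonnegative attribute values.

    values: dict mapping attribute name -> nonnegative observed value.
    mu:     dict mapping frozenset of attribute names -> nonnegative weight
            (a set function; hypothetical representation for this sketch).
    """
    # Sort attributes by ascending value: f(x_(1)) <= ... <= f(x_(n)).
    order = sorted(values, key=values.get)
    total = 0.0
    previous = 0.0  # f(x_(0)) is taken to be 0
    for i, name in enumerate(order):
        # Set of attributes whose value is at least the current one.
        upper_set = frozenset(order[i:])
        total += (values[name] - previous) * mu[upper_set]
        previous = values[name]
    return total


# Illustrative (made-up) set function on two attributes. Because
# mu({x1, x2}) differs from mu({x1}) + mu({x2}), the attributes interact.
mu_interacting = {
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.5,
    frozenset({"x1", "x2"}): 1.0,
}

# An additive set function reduces the integral to a plain weighted sum.
mu_additive = {
    frozenset({"x1"}): 0.4,
    frozenset({"x2"}): 0.6,
    frozenset({"x1", "x2"}): 1.0,
}

observation = {"x1": 3.0, "x2": 1.0}
result_interacting = choquet_integral(observation, mu_interacting)
result_additive = choquet_integral(observation, mu_additive)
```

With the additive set function, the result equals the ordinary weighted sum 0.4 * 3 + 0.6 * 1, which is why a linear model can be viewed as a special case. A non-additive set function gives joint contributions of attribute subsets their own weight.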