* Institute of Computational Intelligence, Czestochowa University of Technology, Al. Armii Krajowej 36, 42-200 Czestochowa, Poland.
† Information Technology Institute, Academy of Social Sciences, 90-113 Łódź, Poland.
Int J Neural Syst. 2018 Mar;28(2):1750048. doi: 10.1142/S0129065717500484. Epub 2017 Oct 12.
One of the greatest challenges in data mining is the processing and analysis of massive data streams. Unlike traditional static data mining problems, data streams require that each element be processed only once, that the amount of allocated memory be constant, and that the models adapt to changes in the investigated streams. The vast majority of available methods have been developed for data stream classification; only a few have attempted to solve regression problems, and those rely on various heuristic approaches. In this paper, we develop mathematically justified regression models that work in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties: weak (in probability) and strong (with probability one) convergence under various concept drift scenarios. First, we present IGRNNs, based on Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drift that our approach can handle in such a way that weak and strong convergence hold under certain conditions. In a series of simulations, we then compare our method with commonly used heuristic approaches to concept drift, based on forgetting mechanisms or sliding windows. Finally, we apply our approach to a real-life scenario: the prediction of currency exchange rates.
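To make the idea of an incremental, Parzen-kernel-based regression estimate concrete, the following minimal sketch may help. It is not the authors' IGRNN and is not taken from the paper: it is a generic recursive kernel regression (a ratio of two stochastic-approximation averages) evaluated at a fixed set of query points, so each stream element is used once and memory stays constant. The class name `IncrementalKernelRegressor`, the Gaussian kernel, and the schedules `a_n = n ** -gain_exp` and `h_n = n ** -bandwidth_exp` are all illustrative assumptions.

```python
import math
import random


class IncrementalKernelRegressor:
    """One-pass kernel regression estimate at fixed query points (illustrative sketch)."""

    def __init__(self, query_points, gain_exp=1.0, bandwidth_exp=0.2):
        self.query_points = list(query_points)
        self.gain_exp = gain_exp            # assumed gain schedule: a_n = n ** -gain_exp
        self.bandwidth_exp = bandwidth_exp  # assumed bandwidth schedule: h_n = n ** -bandwidth_exp
        self.n = 0
        # Running numerator and denominator of the regression estimate,
        # one pair per query point, so memory does not grow with the stream.
        self.numer = [0.0] * len(self.query_points)
        self.denom = [0.0] * len(self.query_points)

    @staticmethod
    def _gaussian(u):
        # Gaussian Parzen kernel (an assumed choice of kernel).
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def update(self, x, y):
        """Consume one stream element (x, y); it is never revisited."""
        self.n += 1
        a = self.n ** -self.gain_exp
        h = self.n ** -self.bandwidth_exp
        for i, q in enumerate(self.query_points):
            k = self._gaussian((q - x) / h) / h
            self.numer[i] += a * (y * k - self.numer[i])
            self.denom[i] += a * (k - self.denom[i])

    def predict(self):
        """Return the current estimate at each query point (0.0 if no mass yet)."""
        return [r / f if f > 1e-12 else 0.0
                for r, f in zip(self.numer, self.denom)]


if __name__ == "__main__":
    # Hypothetical stationary test stream: y = sin(x) + Gaussian noise.
    random.seed(0)
    model = IncrementalKernelRegressor([math.pi / 2])
    for _ in range(5000):
        x = random.uniform(0.0, math.pi)
        model.update(x, math.sin(x) + random.gauss(0.0, 0.1))
    print(model.predict())
```

With `gain_exp=1.0` the two running quantities are plain averages over the whole stream, which suits a stationary target. Choosing `gain_exp` below 1 gives recent samples more weight, which is one common way to let such an estimate follow a drifting target; the conditions under which the paper's models converge are given in the paper itself and are not reproduced here.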