Convergent Time-Varying Regression Models for Data Streams: Tracking Concept Drift by the Recursive Parzen-Based Generalized Regression Neural Networks.

Affiliations

* Institute of Computational Intelligence, Czestochowa University of Technology, Al. Armii Krajowej 36, 42-200 Czestochowa, Poland.

† Information Technology Institute, Academy of Social Sciences, 90-113 Łódź, Poland.

Publication information

Int J Neural Syst. 2018 Mar;28(2):1750048. doi: 10.1142/S0129065717500484. Epub 2017 Oct 12.

Abstract

One of the greatest challenges in data mining is the processing and analysis of massive data streams. In contrast to traditional static data mining problems, data streams require that each element be processed only once, that the amount of allocated memory be constant, and that the models incorporate changes in the investigated streams. The vast majority of available methods have been developed for data stream classification, and only a few have attempted to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties, namely weak (in probability) and strong (with probability one) convergence, assuming various concept drift scenarios. First, we present the IGRNNs, based on Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drift that our approach can handle in such a way that weak and strong convergence hold under certain conditions. In a series of simulations, we then compare our method with commonly used heuristic approaches to concept drift, based on a forgetting mechanism or sliding windows. Finally, we apply our concept in a real-life scenario, solving the problem of currency exchange rate prediction.
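
The abstract does not state the IGRNN recursions, so the following is only a rough sketch of the general idea of a recursive Parzen-kernel regression estimate, not the authors' procedure. It keeps a running numerator and denominator at a fixed grid of query points, so each stream element is processed once and memory stays constant. The class name `RecursiveKernelRegressor`, the Gaussian kernel, the gain sequence a_n = n^(-0.7), the bandwidth sequence h_n = n^(-0.2), the grid, and the synthetic drifting stream are all assumptions chosen for illustration.

```python
import math
import random


class RecursiveKernelRegressor:
    """Recursive Parzen-kernel regression estimate on a fixed grid (illustrative)."""

    def __init__(self, grid, gain_exp=0.7, bandwidth_exp=0.2, bandwidth_scale=1.0):
        self.grid = list(grid)
        self.gain_exp = gain_exp            # assumed gain: a_n = n ** -gain_exp
        self.bandwidth_exp = bandwidth_exp  # assumed bandwidth: h_n = scale * n ** -bandwidth_exp
        self.bandwidth_scale = bandwidth_scale
        self.num = [0.0] * len(self.grid)   # running estimate of phi(x) * f(x)
        self.den = [0.0] * len(self.grid)   # running estimate of the input density f(x)
        self.n = 0

    @staticmethod
    def _gaussian(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def update(self, x, y):
        """Process one stream element (x, y); it is not stored afterwards."""
        self.n += 1
        a = self.n ** -self.gain_exp
        h = self.bandwidth_scale * self.n ** -self.bandwidth_exp
        for i, g in enumerate(self.grid):
            k = self._gaussian((g - x) / h) / h
            # Stochastic-approximation style updates of numerator and denominator.
            self.num[i] += a * (y * k - self.num[i])
            self.den[i] += a * (k - self.den[i])

    def predict(self):
        """Current regression estimate at each grid point (0.0 where no mass yet)."""
        return [r / f if f > 1e-12 else 0.0 for r, f in zip(self.num, self.den)]


def simulate_drifting_stream(n_samples=5000, seed=0):
    """Hypothetical stream: y = sin(x) + slow additive drift + Gaussian noise."""
    rng = random.Random(seed)
    grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
    model = RecursiveKernelRegressor(grid)
    for t in range(1, n_samples + 1):
        x = rng.uniform(-3.0, 3.0)
        drift = t / n_samples  # target function shifts upward from 0 to 1
        y = math.sin(x) + drift + rng.gauss(0.0, 0.1)
        model.update(x, y)
    return grid, model.predict()
```

Calling `simulate_drifting_stream()` returns the grid and the estimates after the last element. With a gain exponent below 1, recent elements carry more weight than they would in a plain running average, which is what lets the estimate follow the drift in this toy setting. Whether such sequences satisfy the paper's convergence conditions is not something the abstract allows one to check.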
