Duda Piotr, Rutkowski Leszek, Jaworski Maciej, Rutkowska Danuta
IEEE Trans Cybern. 2020 Apr;50(4):1683-1696. doi: 10.1109/TCYB.2018.2877611. Epub 2018 Nov 15.
In this paper, we propose a recursive variant of the Parzen kernel density estimator (KDE) that tracks changes in a time-varying density over data streams in a nonstationary environment. In stationary environments, well-established traditional KDE techniques have favorable asymptotic properties. Existing extensions of these techniques to stream data are mostly based on heuristic concepts and therefore lose these convergence properties. We study recursive KDEs, called recursive concept drift tracking KDEs, and prove their weak (in probability) and strong (with probability one) convergence, which yields perfect tracking properties as the sample size approaches infinity. In three theorems and subsequent examples, we show how to choose the bandwidth and learning rate of a recursive KDE to ensure weak and strong convergence. Simulation results illustrate the effectiveness of the algorithm for both density estimation and classification over time-varying stream data.
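The abstract does not spell out the estimator itself, so the following is only a minimal sketch, assuming the common stochastic-approximation form of a recursive KDE: f_n(x) = f_{n-1}(x) + a_n [K((x - X_n)/h_n)/h_n - f_{n-1}(x)], with learning rate a_n and bandwidth h_n. The Gaussian kernel, the power-law schedules a_n = n^(-alpha) and h_n = h0 * n^(-beta), the exponent values, and the names `RecursiveKDE` and `update` are illustrative assumptions. They are not the sequences or conditions given in the paper's theorems.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the Parzen kernel (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


class RecursiveKDE:
    """Hypothetical recursive KDE evaluated on a fixed grid of points."""

    def __init__(self, grid, alpha=0.7, beta=0.2, h0=1.0):
        self.grid = np.asarray(grid, dtype=float)
        self.alpha = alpha  # decay exponent of the learning rate (assumed)
        self.beta = beta    # decay exponent of the bandwidth (assumed)
        self.h0 = h0        # initial bandwidth (assumed)
        self.n = 0
        self.density = np.zeros_like(self.grid)

    def update(self, x):
        """Fold one stream element into the estimate in O(len(grid)) time."""
        self.n += 1
        a_n = self.n ** (-self.alpha)            # learning rate
        h_n = self.h0 * self.n ** (-self.beta)   # bandwidth
        kernel_term = gaussian_kernel((self.grid - x) / h_n) / h_n
        # Convex combination of the previous estimate and the new kernel bump.
        self.density += a_n * (kernel_term - self.density)
        return self.density


# Synthetic drifting stream: the mean of N(mean_t, 1) moves from 0 toward 4.
rng = np.random.default_rng(0)
grid = np.linspace(-10.0, 14.0, 2401)
kde = RecursiveKDE(grid)
n_samples = 5000
for t in range(n_samples):
    mean_t = 4.0 * t / n_samples
    kde.update(rng.normal(mean_t, 1.0))

dx = grid[1] - grid[0]
total_mass = float(np.sum(kde.density) * dx)                # Riemann-sum integral
tracked_mean = float(np.sum(grid * kde.density) * dx)       # mean of the estimate
print(total_mass, tracked_mean)
```

Because each update is a convex combination of densities, the estimate stays nonnegative and keeps unit mass (up to grid truncation). A slowly decaying learning rate weights recent samples more heavily, which is what lets the estimate follow the drifting mean in this toy stream.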