Kensuke Fukuda, H. Eugene Stanley, Luís A. Nunes Amaral
NTT Network Innovation Laboratories, Tokyo 180-8585, Japan.
Phys Rev E Stat Nonlin Soft Matter Phys. 2004 Feb;69(2 Pt 1):021108. doi: 10.1103/PhysRevE.69.021108. Epub 2004 Feb 25.
Many phenomena, both natural and human-influenced, give rise to signals whose statistical properties change under time translation, i.e., are nonstationary. For some practical purposes, a nonstationary time series can be seen as a concatenation of stationary segments. However, the exact segmentation of a nonstationary time series is a hard computational problem that cannot be solved exactly by existing methods. For this reason, heuristic methods have been proposed. Using one such method, it has been reported that for several cases of interest (e.g., heartbeat data and Internet traffic fluctuations), the distribution of durations of these stationary segments decays with a power-law tail. A potential technical difficulty that has not been thoroughly investigated is that a nonstationary time series with a (scale-free) power-law distribution of stationary segments is harder to segment than other nonstationary time series because of the wider range of possible segment lengths. Here, we investigate the validity of a heuristic segmentation algorithm recently proposed by Bernaola-Galván et al. [Phys. Rev. Lett. 87, 168105 (2001)] by systematically analyzing surrogate time series with different statistical properties. We find that if a given nonstationary time series has stationary periods whose lengths are distributed as a power law, the algorithm can split the time series into a set of stationary segments with the correct statistical properties. We also find that the estimated power-law exponent of the distribution of stationary-segment lengths is affected by (i) the minimum segment length and (ii) the ratio R ≡ σ_ε/σ_x, where σ_x is the standard deviation of the mean values of the segments and σ_ε is the standard deviation of the fluctuations within a segment.
Furthermore, we determine that the performance of the algorithm is generally not affected by uncorrelated noise spikes or by weak long-range temporal correlations of the fluctuations within segments.
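The kind of surrogate series described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes segment lengths drawn from a truncated continuous power law by inverse-transform sampling, Gaussian segment means with standard deviation σ_x, and uncorrelated Gaussian fluctuations with standard deviation σ_ε = R·σ_x. The function names, parameter names, and default values (`beta`, `l_min`, `l_max`, `seed`) are hypothetical and chosen only for illustration.

```python
import random


def power_law_length(rng, beta, l_min, l_max):
    """Draw a length with P(l) ~ l**(-beta) on [l_min, l_max).

    Inverse-transform sampling of a truncated continuous power law;
    requires beta != 1. The result is rounded down to an integer.
    """
    a = 1.0 - beta
    lo, hi = l_min ** a, l_max ** a
    u = rng.random()
    length = int((lo + u * (hi - lo)) ** (1.0 / a))
    return max(l_min, length)  # guard against floating-point undershoot


def make_surrogate(n, beta=2.0, l_min=50, l_max=5000,
                   sigma_x=1.0, R=0.5, seed=0):
    """Concatenate stationary segments into a series of n points.

    Assumptions (not taken from the paper): each segment mean is
    Gaussian with standard deviation sigma_x, and the fluctuations
    inside a segment are uncorrelated Gaussian noise with standard
    deviation sigma_eps = R * sigma_x.
    Returns the series and the end index of every segment.
    """
    rng = random.Random(seed)
    sigma_eps = R * sigma_x
    series, boundaries = [], []
    while len(series) < n:
        length = power_law_length(rng, beta, l_min, l_max)
        length = min(length, n - len(series))  # final segment may be cut short
        mean = rng.gauss(0.0, sigma_x)
        series.extend(rng.gauss(mean, sigma_eps) for _ in range(length))
        boundaries.append(len(series))
    return series, boundaries


if __name__ == "__main__":
    series, boundaries = make_surrogate(10_000, beta=2.0, R=0.5, seed=42)
    print(len(series), "points in", len(boundaries), "segments")
```

A segmentation algorithm can then be run on `series` and its detected boundaries compared with `boundaries`; increasing `R` or lowering `l_min` makes the true segments harder to recover.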