Lindstrom Michael R, Jung Hyuntae, Larocque Denis
Department of Mathematics, University of California, Los Angeles, CA 90024, USA.
Global Aviation Data Management, International Air Transport Association (IATA), Montréal, QC H2Y 1C6, Canada.
Entropy (Basel). 2020 Nov 30;22(12):1363. doi: 10.3390/e22121363.
We present an unsupervised method to detect anomalous time series within a collection of time series. To do so, we extend traditional Kernel Density Estimation, which estimates probability distributions in Euclidean space, to Hilbert spaces. The estimated probability densities we derive can be obtained formally in two ways: by treating each series as a point in a Hilbert space, placing a kernel at each point, and summing the kernels (a "point approach"), or by using Kernel Density Estimation to approximate the distributions of Fourier mode coefficients and inferring a probability density from them (a "Fourier approach"). We refer to both approaches as Functional Kernel Density Estimation for Anomaly Detection, as each yields a functional that scores how anomalous a time series is. Both methods naturally handle missing data and apply to a variety of settings. They perform well when compared with an outlyingness score derived from a boxplot method for functional data, with a Principal Component Analysis approach for functional data, and with the Functional Isolation Forest method. We illustrate the use of the proposed methods with aviation safety report data from the International Air Transport Association (IATA).
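To make the "point approach" concrete, the following is a minimal sketch and not the authors' implementation. It assumes fully observed series on a common uniform grid, a Gaussian kernel on the discretized L2 distance, a median-distance bandwidth heuristic, and a leave-one-out sum; the function name `point_kde_scores` and all parameter choices are hypothetical. Missing data, which the abstract says both methods handle, is not covered here.

```python
import numpy as np

def point_kde_scores(series, dt=1.0, bandwidth=None):
    """Leave-one-out anomaly scores; a higher score means more anomalous.

    Illustrative only: kernel, bandwidth rule, and scoring are assumptions.
    """
    X = np.asarray(series, dtype=float)           # shape (n_series, n_times)
    n = X.shape[0]
    # Discretized L2 distance between every pair of series.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(dt * np.sum(diffs ** 2, axis=-1))
    if bandwidth is None:
        # Assumed heuristic: median of the off-diagonal pairwise distances.
        bandwidth = np.median(dist[~np.eye(n, dtype=bool)])
    # One Gaussian kernel per series, evaluated at every other series.
    kernels = np.exp(-0.5 * (dist / bandwidth) ** 2)
    np.fill_diagonal(kernels, 0.0)                # exclude each series' own kernel
    density = kernels.sum(axis=1) / (n - 1)       # unnormalized density estimate
    return -np.log(density + 1e-300)              # small constant avoids log(0)

# Synthetic data: 20 noisy sine curves plus one shifted cosine curve.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
normal = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((20, t.size))
outlier = np.cos(2 * np.pi * t) + 2.0
scores = point_kde_scores(np.vstack([normal, outlier]), dt=t[1] - t[0])
most_anomalous = int(np.argmax(scores))           # index of the highest score
```

In this synthetic setup the shifted cosine curve lies far from every sine curve in L2 distance, so its summed kernel value is tiny and its score is the largest. The density is left unnormalized because only the ranking of scores matters for flagging anomalies in this sketch.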