School of Information Science and Engineering, Xinjiang University, Ürümqi, 830046, China.
Sci Rep. 2023 Feb 10;13(1):2408. doi: 10.1038/s41598-023-29549-1.
Outlier detection is an important topic in machine learning and has been used in a wide range of applications. Outliers are objects that are few in number and deviate from the majority of objects. As a result of these two properties, we show that outliers are susceptible to a mechanism called fluctuation. This article proposes a method called fluctuation-based outlier detection (FBOD) that runs in linear time and detects outliers purely on the basis of fluctuation, without employing any distance, density or isolation measure, which makes it fundamentally different from all existing methods. FBOD first converts Euclidean-structured datasets into graphs by using random links, then propagates the feature values along the connections of the graph. Finally, FBOD compares the fluctuation of each object with that of its neighbors and identifies the objects with larger differences as outliers. Experiments comparing FBOD with eight state-of-the-art algorithms on eight real-world tabular datasets and three video datasets show that FBOD outperforms its competitors in the majority of cases and requires only 5% of the execution time of the fastest competing algorithm. The experiment code is available at https://github.com/FluctuationOD/Fluctuation-based-Outlier-Detection.
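The three steps named in the abstract (random links, feature propagation, fluctuation comparison) can be illustrated with a minimal Python sketch. The abstract does not give formulas, so the concrete choices below are assumptions made for illustration and are not taken from the paper or its repository: the function name `fluctuation_scores`, the parameters `n_links` and `n_rounds`, min-max scaling of the features, summation as the propagation rule, and fluctuation defined as the ratio of an object's own feature sum to its feature sum after propagation.

```python
import numpy as np


def fluctuation_scores(X, n_links=8, n_rounds=20, seed=0):
    """Illustrative fluctuation-style outlier scores (higher = more outlying).

    Assumed definitions, not the published ones:
      * graph: each object is linked to n_links other objects chosen at random
      * propagation: an object's features are summed with its neighbours' features
      * fluctuation: own feature sum / feature sum after propagation
      * score: mean absolute gap between an object's fluctuation and its
        neighbours' fluctuations, averaged over n_rounds random graphs
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Min-max scale each feature to [0, 1] so the feature sums are non-negative.
    span = X.max(axis=0) - X.min(axis=0)
    X = (X - X.min(axis=0)) / np.where(span > 0, span, 1.0)

    own = X.sum(axis=1)  # feature sum of each object, shape (n,)
    scores = np.zeros(n)
    eps = 1e-12          # guards against division by zero

    for _ in range(n_rounds):
        # Step 1: random links. Offsets in [1, n-1] guarantee no self-links.
        offsets = rng.integers(1, n, size=(n, n_links))
        nbrs = (np.arange(n)[:, None] + offsets) % n

        # Step 2: propagate feature values along the links.
        nbr_sum = X[nbrs].sum(axis=(1, 2))
        fluct = own / (own + nbr_sum + eps)

        # Step 3: compare each object's fluctuation with its neighbours'.
        scores += np.abs(fluct[:, None] - fluct[nbrs]).mean(axis=1)

    return scores / n_rounds


# Synthetic example: 200 clustered inliers plus one distant object at index 200.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(5.0, 0.5, size=(200, 3)), [[50.0, 50.0, 50.0]]])
scores = fluctuation_scores(data)
suspect = int(np.argmax(scores))  # index of the highest-scoring object
```

Each round costs a number of operations proportional to the number of objects times `n_links` times the number of features, so the sketch stays linear in the dataset size, and it uses no pairwise distance, density estimate or isolation tree. Because the links are random, averaging over several rounds is used here to stabilise the scores; whether the published method does the same is not stated in the abstract.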