A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data.

Affiliation

Information and Telecommunication Engineering College, Beijing University of Posts and Telecommunications, Beijing, China.

Publication information

Comput Intell Neurosci. 2017;2017:8501683. doi: 10.1155/2017/8501683. Epub 2017 Nov 15.

Abstract

Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each observation and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples are similar and each sample may perform like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k-nearest neighbor graph (K-NNG)-based anomaly detector. Benefiting from the ability of nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset to represent the high-dimensional data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity.
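The second stage described in the abstract (several KNN-based detectors built on random subsets, with a combined final prediction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the inputs are already compact features, such as the output of a trained DAE encoder, which is not shown here. It further assumes that each detector scores a point by its distance to the k-th nearest neighbor in its subset, and that the ensemble averages those scores. The abstract does not specify the scoring or combination rule, and the function names and default parameters below are hypothetical.

```python
import numpy as np


def kth_neighbor_distance(queries, reference, k):
    """Euclidean distance from each query row to its k-th nearest reference row."""
    # Pairwise distances, shape (n_queries, n_reference).
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # np.partition puts the k-th smallest value of each row at index k - 1.
    return np.partition(dists, k - 1, axis=1)[:, k - 1]


def ensemble_knn_scores(train, queries, n_detectors=10, subset_size=64, k=5, seed=0):
    """Average k-th-neighbor distance over detectors built on random subsets.

    `train` is assumed to hold nominal samples only (the semi-supervised
    setting); larger scores suggest a more anomalous query point.
    Averaging is an assumed combination rule, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    subset_size = min(subset_size, len(train))
    if k > subset_size:
        raise ValueError("k must not exceed the subset size")
    scores = np.zeros(len(queries))
    for _ in range(n_detectors):
        # Each detector sees its own random subset, drawn without replacement.
        idx = rng.choice(len(train), size=subset_size, replace=False)
        scores += kth_neighbor_distance(queries, train[idx], k)
    return scores / n_detectors


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    nominal = rng.normal(size=(500, 8))            # stand-in for encoded nominal data
    queries = np.vstack([np.zeros(8), np.full(8, 10.0)])
    print(ensemble_knn_scores(nominal, queries))   # second score should be much larger
```

Because each detector only compares against a small subset, the cost per query scales with `n_detectors * subset_size` rather than with the full training set size, which is one plausible reading of the reduced computational complexity the abstract reports.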

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb51/5706085/d6dbc30649c7/CIN2017-8501683.001.jpg
