
Outlier detection with the kernelized spatial depth function.

Authors

Chen Yixin, Dang Xin, Peng Hanxiang, Bart Henry L

Affiliation

Department of Computer and Information Science, University of Mississippi, University, MS 38677, USA.

Publication

IEEE Trans Pattern Anal Mach Intell. 2009 Feb;31(2):288-305. doi: 10.1109/TPAMI.2008.72.

DOI: 10.1109/TPAMI.2008.72
PMID: 19110494

Abstract

Statistical depth functions provide a "center-outward ordering" of multidimensional data, starting from the "deepest" point. In this sense, depth functions can measure the "extremeness" or "outlyingness" of a data point with respect to a given data set. Hence, they can detect outliers, that is, observations that appear extreme relative to the rest of the observations. Of the various statistical depths, the spatial depth is especially appealing because of its computational efficiency and mathematical tractability. In this article, we propose a novel statistical depth, the kernelized spatial depth (KSD), which generalizes the spatial depth via positive definite kernels. By choosing a proper kernel, the KSD can capture the local structure of a data set where the spatial depth fails. We demonstrate this on half-moon data and ring-shaped data. Based on the KSD, we propose a novel outlier detection algorithm that declares an observation an outlier when its depth value is less than a threshold. The proposed algorithm is simple in structure: for a given kernel, the threshold is the only parameter. It applies to a one-class learning setting, in which "normal" observations are given as the training data, as well as to a missing-label scenario, where the training set consists of a mixture of normal observations and outliers with unknown labels. We give upper bounds on the false alarm probability of a depth-based detector. These upper bounds can be used to determine the threshold. We perform extensive experiments on synthetic data and on data sets from real applications. The proposed outlier detector is compared with existing methods. The KSD outlier detector demonstrates competitive performance.
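To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual sample spatial depth (one minus the norm of the averaged unit vectors pointing from the query point to the data) and kernelizes it by writing every inner product in terms of kernel evaluations. The Gaussian kernel, the bandwidth `sigma=0.3`, the normalization by the number of training points, the function names, and the evenly spaced ring data are all assumptions made for illustration; the paper may differ in these details.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))


def spatial_depth(x, X):
    """Sample spatial depth: 1 - ||average unit vector from x to the data||."""
    diffs = X - x
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 1e-12  # the unit vector of a zero difference is taken as 0
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.sum(axis=0)) / len(X)


def kernelized_spatial_depth(x, X, sigma=0.3):
    """Spatial depth computed in the kernel-induced feature space (sketch)."""
    x = np.asarray(x, dtype=float)[None, :]
    k_xx = 1.0  # Gaussian kernel: k(x, x) = 1
    k_xX = gaussian_kernel(x, X, sigma)[0]
    K = gaussian_kernel(X, X, sigma)
    # Feature-space distance between x and each training point.
    dist = np.sqrt(np.maximum(k_xx + np.diag(K) - 2.0 * k_xX, 0.0))
    keep = dist > 1e-8  # drop points that coincide with x in feature space
    w = 1.0 / dist[keep]
    # G[i, j] = <phi(x) - phi(y_i), phi(x) - phi(y_j)>, via kernel values only.
    G = k_xx + K[np.ix_(keep, keep)] - k_xX[keep][:, None] - k_xX[keep][None, :]
    norm_sq = w @ G @ w
    return 1.0 - np.sqrt(max(norm_sq, 0.0)) / len(X)


def is_outlier(depth, threshold):
    """Threshold rule from the abstract: low depth means outlier."""
    return depth < threshold


# Ring-shaped toy data: 200 evenly spaced points on the unit circle (assumed).
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.column_stack([np.cos(angles), np.sin(angles)])
center = np.array([0.0, 0.0])  # inside the ring, far from every data point
on_ring = X[0]                 # a point lying on the ring itself

print("spatial depth, center vs ring point:",
      spatial_depth(center, X), spatial_depth(on_ring, X))
print("KSD, center vs ring point:",
      kernelized_spatial_depth(center, X), kernelized_spatial_depth(on_ring, X))
```

In this toy setup the plain spatial depth treats the empty center of the ring as the deepest location, while the kernelized version ranks the point on the ring above the center, which is the behavior the abstract describes for ring-shaped data. A detector would then compare each depth against a chosen threshold with `is_outlier`; how to pick that threshold (the abstract mentions bounds on the false alarm probability) is not covered by this sketch.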

Similar articles

1. Outlier detection with the kernelized spatial depth function.
IEEE Trans Pattern Anal Mach Intell. 2009 Feb;31(2):288-305. doi: 10.1109/TPAMI.2008.72.
2. Motion segmentation and depth ordering using an occlusion detector.
IEEE Trans Pattern Anal Mach Intell. 2008 Jul;30(7):1171-85. doi: 10.1109/TPAMI.2007.70766.
3. A cluster validity measure with outlier detection for support vector clustering.
IEEE Trans Syst Man Cybern B Cybern. 2008 Feb;38(1):78-89. doi: 10.1109/TSMCB.2007.908862.
4. Kernel discriminant analysis for positive definite and indefinite kernels.
IEEE Trans Pattern Anal Mach Intell. 2009 Jun;31(6):1017-32. doi: 10.1109/TPAMI.2008.290.
5. A small sphere and large margin approach for novelty detection using training data with outliers.
IEEE Trans Pattern Anal Mach Intell. 2009 Nov;31(11):2088-92. doi: 10.1109/TPAMI.2009.24.
6. Tracking by affine kernel transformations using color and boundary cues.
IEEE Trans Pattern Anal Mach Intell. 2009 Jan;31(1):164-71. doi: 10.1109/TPAMI.2008.194.
7. Graph-based semisupervised learning.
IEEE Trans Pattern Anal Mach Intell. 2008 Jan;30(1):174-9. doi: 10.1109/TPAMI.2007.70765.
8. Approximate matching of digital point sets using a novel angular tree.
IEEE Trans Pattern Anal Mach Intell. 2009 May;31(5):769-82. doi: 10.1109/TPAMI.2007.70812.
9. Registration with uncertainties and statistical modeling of shapes with variable metric kernels.
IEEE Trans Pattern Anal Mach Intell. 2009 Jan;31(1):99-113. doi: 10.1109/TPAMI.2008.36.
10. 2D shape matching by contour flexibility.
IEEE Trans Pattern Anal Mach Intell. 2009 Jan;31(1):180-6. doi: 10.1109/TPAMI.2008.199.

Cited by

1. Anomaly detection based on a dynamic Markov model.
Inf Sci (N Y). 2017 Oct;411:52-65. doi: 10.1016/j.ins.2017.05.021. Epub 2017 May 15.
2. Unsupervised Scalable Statistical Method for Identifying Influential Users in Online Social Networks.
Sci Rep. 2018 May 3;8(1):6955. doi: 10.1038/s41598-018-24874-2.
3. Graph ranking for exploratory gene data analysis.
BMC Bioinformatics. 2009 Oct 8;10 Suppl 11(Suppl 11):S19. doi: 10.1186/1471-2105-10-S11-S19.