Two novel outlier detection approaches based on unsupervised possibilistic and fuzzy clustering.

Author information

Cebeci Zeynel, Cebeci Cagatay, Tahtali Yalcin, Bayyurt Lutfi

Affiliations

Department of Animal Science, Faculty of Agriculture, Cukurova University, Adana, Turkey.

Department of Electronics & Electrical Engineering, University of Strathclyde, Glasgow, United Kingdom.

Publication information

PeerJ Comput Sci. 2022 Sep 27;8:e1060. doi: 10.7717/peerj-cs.1060. eCollection 2022.

Abstract

Outliers are data points that deviate significantly from the other points in a data set, typically because they arise from a different mechanism or an unusual process. Outlier detection is an intensively studied research topic: it is used to identify novelties, fraud, anomalies, deviations, and exceptions, and to cleanse data in data science. In this study, we propose two novel outlier detection approaches that use typicality degrees, which are the partitioning output of unsupervised possibilistic clustering algorithms. The proposed approaches flag as outliers the atypical data points whose typicality falls below a predefined threshold, a possibilistic level at which a point is judged to be an outlier. Experiments on synthetic and real data sets showed that the proposed approaches can detect outliers successfully without considering the structure or distribution of the features in multidimensional data sets.
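The thresholding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify either of the two proposed approaches in detail. The sketch assumes the standard possibilistic c-means typicality formula, t_ij = 1 / (1 + (d_ij^2 / eta_j)^(1/(m-1))), cluster centers that are fixed by hand rather than fitted, and one simple decision rule (a point is flagged when its highest typicality across all clusters is below the threshold). The names `typicality`, `flag_outliers`, `eta`, and `alpha`, and all data values, are hypothetical.

```python
import numpy as np

def typicality(X, centers, eta, m=2.0):
    """Possibilistic c-means typicality degrees, shape (n_points, n_clusters).

    Assumed formula: t_ij = 1 / (1 + (d_ij^2 / eta_j) ** (1 / (m - 1))).
    """
    # Squared Euclidean distance from every point to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

def flag_outliers(T, alpha=0.1):
    """Flag points whose highest typicality over all clusters is below alpha.

    This decision rule is an assumption made for illustration.
    """
    return T.max(axis=1) < alpha

# Toy data: two tight groups plus one distant point (illustrative values).
X = np.array([
    [0.0, 0.0], [0.2, -0.1], [-0.1, 0.1],
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
    [20.0, -15.0],
])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])  # fixed by hand, not fitted
eta = np.array([1.0, 1.0])                    # per-cluster scale (assumed)

T = typicality(X, centers, eta)
outliers = flag_outliers(T, alpha=0.1)
print(outliers)
```

In this toy setting only the distant point has low typicality for both clusters, so it is the only one flagged. In practice the centers and the scale parameters would come from a fitted possibilistic clustering, and the threshold would need to be chosen for the data at hand.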


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5990/9575855/5696ede3e6b7/peerj-cs-08-1060-g001.jpg
