Query-Based Outlier Detection in Heterogeneous Information Networks.

Authors

Kuck Jonathan, Zhuang Honglei, Yan Xifeng, Cam Hasan, Han Jiawei

Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign.

Computer Science Department, University of California at Santa Barbara.

Publication information

Adv Database Technol. 2015 Mar;2015:325-336. doi: 10.5441/002/edbt.2015.29.

DOI:10.5441/002/edbt.2015.29
PMID:27064397
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4825692/
Abstract

Outlier or anomaly detection in large data sets is a fundamental task in data science, with broad applications. However, in real data sets with high-dimensional space, most outliers are hidden in certain dimensional combinations and are relative to a user's search space and interest. It is often more effective to give power to users and allow them to specify outlier queries flexibly, and the system will then process such mining queries efficiently. In this study, we introduce the concept of query-based outlier in heterogeneous information networks, design a query language to facilitate users to specify such queries flexibly, define a good outlier measure in heterogeneous networks, and study how to process outlier queries efficiently in large data sets. Our experiments on real data sets show that following such a methodology, interesting outliers can be defined and uncovered flexibly and effectively in large heterogeneous networks.
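
The abstract describes the workflow only at a high level: a user query selects a candidate set in the network, and an outlier measure then ranks the candidates. A minimal sketch of that idea follows. Everything in it is an assumption made for illustration: the toy author/venue data, the function names, and the scoring rule (connectivity to the other candidates, normalized by self-connectivity) are hypothetical stand-ins, not the query language or the outlier measure defined in the paper.

```python
from collections import Counter

# Toy heterogeneous network: each paper links authors to one venue.
# All names and counts here are hypothetical illustration data.
PAPERS = [
    {"authors": ["ann", "bob"], "venue": "KDD"},
    {"authors": ["ann", "cat"], "venue": "KDD"},
    {"authors": ["bob", "cat"], "venue": "ICDM"},
    {"authors": ["ann"], "venue": "ICDM"},
    {"authors": ["dan"], "venue": "KDD"},
    {"authors": ["dan"], "venue": "SIGGRAPH"},
    {"authors": ["dan"], "venue": "SIGGRAPH"},
    {"authors": ["dan"], "venue": "SIGGRAPH"},
]


def candidates(papers, venue):
    """Query step: authors with at least one paper in the given venue."""
    return sorted({a for p in papers if p["venue"] == venue for a in p["authors"]})


def venue_profile(papers, author):
    """Count the venues reached along the path author -> paper -> venue."""
    return Counter(p["venue"] for p in papers if author in p["authors"])


def connectivity(x, y):
    """Dot product of two venue-count profiles (missing venues count as 0)."""
    return sum(x[v] * y[v] for v in x)


def outlier_scores(papers, cands):
    """Assumed measure: summed connectivity to the other candidates,
    normalized by self-connectivity. Lower means more outlying."""
    profiles = {a: venue_profile(papers, a) for a in cands}
    scores = {}
    for a in cands:
        self_conn = connectivity(profiles[a], profiles[a])
        scores[a] = sum(
            connectivity(profiles[a], profiles[b]) / self_conn
            for b in cands
            if b != a
        )
    return scores


def top_outliers(papers, venue, k=1):
    """Return the k candidates with the lowest score (ties broken by name)."""
    scores = outlier_scores(papers, candidates(papers, venue))
    return sorted(scores, key=lambda a: (scores[a], a))[:k]


if __name__ == "__main__":
    print(top_outliers(PAPERS, "KDD", k=1))
```

In this toy data the query "authors who published in KDD" selects four candidates, and the author whose other papers sit in an unrelated venue receives the lowest score. The point of the sketch is that the candidate set, and therefore what counts as an outlier, is fixed by the user's query rather than by the data set as a whole.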

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2413/4825692/2e9a67444964/nihms743073f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2413/4825692/fb64788a706c/nihms743073f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2413/4825692/1ac8bc2d37f7/nihms743073f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2413/4825692/4f0133b41b7f/nihms743073f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2413/4825692/9398f7aac76b/nihms743073f4.jpg

Similar articles

1
Query-Based Outlier Detection in Heterogeneous Information Networks.
Adv Database Technol. 2015 Mar;2015:325-336. doi: 10.5441/002/edbt.2015.29.
2
Stratification-Based Outlier Detection over the Deep Web.
Comput Intell Neurosci. 2016;2016:7386517. doi: 10.1155/2016/7386517. Epub 2016 May 25.
3
Visually defining and querying consistent multi-granular clinical temporal abstractions.
Artif Intell Med. 2012 Feb;54(2):75-101. doi: 10.1016/j.artmed.2011.10.004. Epub 2011 Dec 15.
4
Geo-Social Top-k and Skyline Keyword Queries on Road Networks.
Sensors (Basel). 2020 Feb 1;20(3):798. doi: 10.3390/s20030798.
5
A novel subspace outlier detection method by entropy-based clustering algorithm.
Sci Rep. 2023 Sep 15;13(1):15331. doi: 10.1038/s41598-023-42261-4.
6
Improving biomedical information retrieval by linear combinations of different query expansion techniques.
BMC Bioinformatics. 2016 Jul 25;17 Suppl 7(Suppl 7):238. doi: 10.1186/s12859-016-1092-8.
7
Development of a methodology for the detection of hospital financial outliers using information systems.
Int J Health Plann Manage. 2014 Jul-Sep;29(3):e207-32. doi: 10.1002/hpm.2194. Epub 2013 Jun 20.
8
An Ensemble Outlier Detection Method Based on Information Entropy-Weighted Subspaces for High-Dimensional Data.
Entropy (Basel). 2023 Aug 9;25(8):1185. doi: 10.3390/e25081185.
9
Going Beyond Provenance: Explaining Query Answers with Pattern-based Counterbalances.
Proc ACM SIGMOD Int Conf Manag Data. 2019 Jun;2019:485-502. doi: 10.1145/3299869.3300066.
10
Temporal event searches based on event maps and relationships.
Appl Soft Comput. 2019 Dec;85:105750. doi: 10.1016/j.asoc.2019.105750. Epub 2019 Sep 25.

Cited by

1
PReP: Path-Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks.
KDD. 2017 Aug;2017:425-434. doi: 10.1145/3097983.3097990.
2
Information Spread and Topic Diffusion in Heterogeneous Information Networks.
Sci Rep. 2018 Jun 22;8(1):9549. doi: 10.1038/s41598-018-27385-2.