Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm.

Author information

Brzezińska Agnieszka Nowak, Horyń Czesław

Affiliation

University of Silesia in Katowice, Institute of Computer Science, Katowice, Bankowa 12, Poland.

Publication information

Procedia Comput Sci. 2021;192:3010-3019. doi: 10.1016/j.procs.2021.09.073. Epub 2021 Oct 1.

Abstract

This article addresses the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors proceed from the automatic generation of a rule-based knowledge base from source data, through clustering the rules in the knowledge base to optimize inference, to detecting unusual rules so that the rule groups can take an optimal structure. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters while outlier rules are still present in the knowledge base. In the second phase, we detect outliers among the rules using the LOF (Local Outlier Factor) algorithm. We then eliminate the unusual rules from the database and check whether the selected cluster quality measures respond positively to the elimination of outliers, which would indicate that the rules were rightly considered outliers. The experiments confirmed the effectiveness of the LOF algorithm and of the selected cluster quality measures in detecting atypical rules. Detecting such rules can support knowledge engineers or domain experts in knowledge mining and help improve the completeness of the knowledge base, which is usually the basis of a decision support system.
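To make the second phase more concrete, the following is a minimal sketch of the standard LOF computation in plain Python. It is not the authors' implementation. The numeric encoding of rules as points, the sample data, the neighbourhood size `k = 2`, and the cut-off `threshold = 1.5` are all assumptions made for illustration, and the sketch takes exactly `k` nearest neighbours (ignoring distance ties) and assumes no duplicate points.

```python
import math


def lof_scores(points, k):
    """Return the Local Outlier Factor of every point (illustrative sketch)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances between all points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # For each point: indices of its k nearest neighbours and its k-distance.
    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(k / sum(reach))  # assumes distinct points, so sum(reach) > 0

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical data: four "rules" encoded as 2-D points in a tight group,
# plus one point far away from them.
rules = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lof_scores(rules, k=2)

# Assumed cut-off: scores well above 1 mark a point as a local outlier.
threshold = 1.5
outliers = [i for i, s in enumerate(scores) if s > threshold]
kept = [r for i, r in enumerate(rules) if i not in outliers]
```

Scores close to 1 mean a point is about as dense as its neighbours, while scores well above 1 mean it lies in a sparser region. In the procedure described above, the points in `kept` would then be re-clustered and the cluster quality measures compared with their values before removal.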


Similar articles

How the Outliers Influence the Quality of Clustering?
Entropy (Basel). 2022 Jun 30;24(7):917. doi: 10.3390/e24070917.

A Swarm Optimization approach for clinical knowledge mining.
Comput Methods Programs Biomed. 2015 Oct;121(3):137-48. doi: 10.1016/j.cmpb.2015.05.007. Epub 2015 Jun 6.

Fast Outlier Detection Using a Grid-Based Algorithm.
PLoS One. 2016 Nov 10;11(11):e0165972. doi: 10.1371/journal.pone.0165972. eCollection 2016.

Qualitative Data Clustering to Detect Outliers.
Entropy (Basel). 2021 Jul 7;23(7):869. doi: 10.3390/e23070869.
