Nowak-Brzezińska Agnieszka, Horyń Czesław
University of Silesia in Katowice, Institute of Computer Science, Katowice, Bankowa 12, Poland.
Procedia Comput Sci. 2021;192:3010-3019. doi: 10.1016/j.procs.2021.09.073. Epub 2021 Oct 1.
This article addresses the detection of outliers in rule-based knowledge bases containing data on COVID-19 cases. The authors proceed from automatically generating a rule-based knowledge base from source data, through clustering the rules to optimize inference, to detecting unusual rules so that the rule groups can be given an optimal structure. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters while outlier rules are still present in the knowledge base. In the second phase, we detect outlier rules using the LOF (Local Outlier Factor) algorithm. We then remove the unusual rules from the knowledge base and check whether the selected cluster quality measures respond positively to their removal, which would indicate that the rules were rightly considered outliers. The experiments confirmed the effectiveness of the LOF algorithm and of the selected cluster quality measures in detecting atypical rules. Detecting such rules can support knowledge engineers or domain experts in knowledge mining and help improve the completeness of the knowledge base, which is usually the basis of a decision support system.
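The LOF step of the second phase can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how rules are encoded, so the numeric vectors in `points`, the neighbourhood size `k`, and the cut-off `threshold` are all hypothetical. The function `lof_scores` follows the standard LOF definition with one simplification, namely that each point uses exactly its k nearest neighbours and ties at the k-distance are not expanded.

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor of every point (illustrative sketch).

    Assumes fewer than k exact duplicates of any point, otherwise a
    reachability sum of zero would cause a division by zero.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # neighbors[i] holds (distance, index) for the k nearest points to i.
    neighbors = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append(dists[:k])

    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbors]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (k * lrd[i]) for i in range(n)
    ]

# Hypothetical data: four "rules" forming a tight group and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = lof_scores(points, k=2)

# Hypothetical cut-off: scores close to 1 mean "as dense as the neighbours".
threshold = 1.5
kept = [p for p, s in zip(points, scores) if s <= threshold]
```

Scores close to 1 indicate a point about as dense as its neighbours, while clearly larger scores flag candidates for removal. In the procedure described above, a cluster quality measure would then be recomputed on `kept` and compared with its value before removal; that step is not shown here.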