Mahmood Sajid, Shahbaz Muhammad, Guergachi Aziz
Department of Computer Science & Engineering, University of Engineering & Technology, Lahore, Pakistan; Al-Khawarizmi Institute of Computer Sciences, UET, Lahore, Pakistan.
Department of Computer Science & Engineering, University of Engineering & Technology, Lahore, Pakistan.
ScientificWorldJournal. 2014;2014:973750. doi: 10.1155/2014/973750. Epub 2014 May 18.
Association rule mining research typically focuses on positive association rules (PARs), which are generated from frequently occurring itemsets. In recent years, however, a significant body of research has focused on finding interesting infrequent itemsets, leading to the discovery of negative association rules (NARs). Discovering infrequent itemsets is far more difficult than discovering their counterparts, the frequent itemsets. The difficulties include discovering the infrequent itemsets themselves, generating accurate NARs, and handling their huge number compared with positive association rules. In medical science, for example, one is interested in factors that can either establish the presence of a disease or rule out its possibility. Vivid positive symptoms are often obvious; negative symptoms, however, are subtler and more difficult to recognize and diagnose. In this paper, we propose an algorithm for discovering positive and negative association rules among frequent and infrequent itemsets. We identify associations among medications, symptoms, and laboratory results using state-of-the-art data mining technology.
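To make the distinction between positive and negative rules concrete, the following minimal sketch computes the standard support and confidence measures for a positive rule A ⇒ B and for the negative rule A ⇒ ¬B. It is not the algorithm proposed in the paper, which the abstract does not specify. The function names (`count_containing`, `rule_metrics`) and the toy patient records are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List


def count_containing(itemset: FrozenSet[str],
                     transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rule_metrics(antecedent: Iterable[str],
                 consequent: Iterable[str],
                 transactions: List[FrozenSet[str]]) -> Dict[str, float]:
    """Support and confidence for A => B and for A => not B.

    Uses the standard identities:
        supp(A and not B) = supp(A) - supp(A and B)
        conf(A => not B)  = 1 - conf(A => B)
    """
    a = frozenset(antecedent)
    b = frozenset(consequent)
    n = len(transactions)
    n_a = count_containing(a, transactions)
    n_ab = count_containing(a | b, transactions)
    n_a_not_b = n_a - n_ab  # transactions with A but without all of B
    return {
        "positive_support": n_ab / n,
        "positive_confidence": n_ab / n_a if n_a else 0.0,
        "negative_support": n_a_not_b / n,
        "negative_confidence": n_a_not_b / n_a if n_a else 0.0,
    }


# Hypothetical patient records (illustrative only, not data from the paper).
records = [
    frozenset({"fever", "cough", "antibiotic"}),
    frozenset({"fever", "cough"}),
    frozenset({"fever", "rash"}),
    frozenset({"cough", "antibiotic"}),
    frozenset({"fever", "cough", "antibiotic"}),
]

metrics = rule_metrics({"fever"}, {"rash"}, records)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy data the itemset {fever, rash} is infrequent (it occurs in one of five records), so a miner restricted to frequent itemsets would discard it, yet the negative rule fever ⇒ ¬rash has high confidence. This is the kind of rule that mining infrequent itemsets aims to surface.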