Jiaozuo Normal College, Jiaozuo 454000, China.
Comput Intell Neurosci. 2022 May 18;2022:7022168. doi: 10.1155/2022/7022168. eCollection 2022.
Association rule mining is an important research topic in data mining that focuses on discovering relationships between database attributes. Mining the maximum frequent itemsets is one of its central difficulties: because every subset of a frequent itemset is itself frequent, the maximum frequent itemsets implicitly determine all frequent itemsets, and some data mining applications need only the maximum frequent itemsets. Studying maximum frequent itemset mining techniques is therefore of practical value. With this in mind, this paper introduces FP-MFIA, a new maximum frequent itemset mining algorithm based on the FP-tree, inspired by the frequent pattern tree data structure and by the idea that the maximum frequent itemsets imply all frequent itemsets. First, FP-MFIA constructs a one-way FP-tree, which has pointers only from the root toward the leaves, so that FP-MFIA requires only two scans of the FP-tree. Second, it redefines MFI-list, a data storage structure for maximum frequent itemsets. After scanning the FP-tree, the algorithm can quickly release nodes that are no longer needed. In this way, the information required for the maximum frequent itemsets can be mined quickly and the space needed to store them is reduced, which greatly improves mining efficiency. Finally, experiments compare the mining efficiency of FP-MFIA with that of the IDMFIA and DMFIA algorithms. The results show that FP-MFIA is more efficient than the other two algorithms.
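To make the notion of a maximum frequent itemset concrete, the following minimal sketch enumerates frequent itemsets by brute force and keeps those with no frequent proper superset. It is not FP-MFIA and uses neither an FP-tree nor an MFI-list; the function names `frequent_itemsets` and `maximal_itemsets`, the toy transactions, and the support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support count = number of transactions containing the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found = True
        if not found:
            # No frequent itemset of this size, so none of a larger size either.
            break
    return result


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical toy database of five transactions.
transactions = [
    frozenset(t)
    for t in (
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c"},
        {"b", "c"},
        {"a", "b", "c", "d"},
    )
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)

# Every frequent itemset is a subset of some maximum frequent itemset.
covered = all(any(s <= m for m in maximal) for s in frequent)
print(sorted(sorted(m) for m in maximal), covered)
```

With this toy data the only maximum frequent itemset is {a, b, c}, and each of the seven frequent itemsets is a subset of it. This covering property is what allows an algorithm to store and report only the maximum frequent itemsets. Note that the subsets' individual support counts are not recoverable from the maximum itemsets alone.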