College of Information Engineering, Shaanxi Institute of International Trade & Commerce, Xi'an 712046, China.
Comput Intell Neurosci. 2022 Aug 23;2022:8526256. doi: 10.1155/2022/8526256. eCollection 2022.
The Apriori algorithm for association rule mining is the main data mining algorithm applied to the treatment and prevention of chronic diseases. As currently used to mine associations between chronic diseases in China's medical field, it has several problems: it must scan the transaction database of cases multiple times, and it produces large data sets and many redundant rules. To address these problems, an association rule mining algorithm that combines a clustering matrix with a pruning strategy is proposed. It improves on Apriori by using the clustering matrix method to compress the stored transaction database and, with added constraint conditions, by introducing prepruning and postpruning strategies. The experimental results show that the optimized algorithm reduces both the number of database scans and the number of candidate item sets generated, which in turn greatly reduces the running time and I/O load of the algorithm and improves its overall efficiency.
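The general idea of compressing the transaction database once and pruning candidates early can be illustrated with a minimal sketch. This is not the paper's clustering matrix method: as a simpler stand-in, identical transactions are collapsed into weighted rows so that the raw data is scanned only once. Only a prepruning step is shown, and postpruning of redundant rules is omitted. The function names, the `min_count` threshold, and the disease labels are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def compress(transactions):
    """Collapse identical transactions into {itemset: count} rows.

    Simplified stand-in for matrix compression (an assumption,
    not the clustering matrix described in the abstract).
    """
    return Counter(frozenset(t) for t in transactions)


def support(itemset, rows):
    """Weighted support count of an itemset over the compressed rows."""
    return sum(count for row, count in rows.items() if itemset <= row)


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all frequent itemsets."""
    rows = compress(transactions)  # the only pass over the raw data
    items = {item for row in rows for item in row}

    # Prepruning: drop infrequent single items before any candidates are built.
    level = {}
    for item in items:
        single = frozenset([item])
        n = support(single, rows)
        if n >= min_count:
            level[single] = n

    result = {}
    k = 1
    while level:
        result.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {}
        for cand in candidates:
            # Apriori property: every (k-1)-subset must already be frequent.
            if all(frozenset(sub) in result for sub in combinations(cand, k - 1)):
                n = support(cand, rows)
                if n >= min_count:
                    level[cand] = n
    return result


# Hypothetical case records, each listing the diagnoses of one patient.
cases = [
    ["hypertension", "diabetes"],
    ["hypertension", "diabetes"],
    ["hypertension"],
    ["diabetes", "obesity"],
]
for itemset, count in frequent_itemsets(cases, min_count=2).items():
    print(sorted(itemset), count)
```

Because support is counted against the compressed rows rather than the raw records, the cost of each counting pass depends on the number of distinct transactions, which is the kind of saving in scans and I/O that the abstract describes.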