Suppr超能文献

通过吉布斯采样增强大型数据集中的关联规则挖掘。

Boosting association rule mining in large datasets via Gibbs sampling.

作者信息

Qian Guoqi, Rao Calyampudi Radhakrishna, Sun Xiaoying, Wu Yuehua

机构信息

School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia;

Department of Biostatistics, University at Buffalo, The State University of New York, Buffalo, NY 14221-3000; CRRAO Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad-500046, India;

出版信息

Proc Natl Acad Sci U S A. 2016 May 3;113(18):4958-63. doi: 10.1073/pnas.1604553113. Epub 2016 Apr 18.

Abstract

Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred transaction items, if no action is taken to constrain the search space. In this paper, we develop a Gibbs-sampling-induced stochastic search procedure to randomly sample association rules from the itemset space, and perform rule mining from the reduced transaction dataset generated by the sample. Also a general rule importance measure is proposed to direct the stochastic search so that, as a result of the randomly generated association rules constituting an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. In the simulation study and a real genomic data example, we show how to boost association rule mining by an integrated use of the stochastic search and the Apriori algorithm.

摘要

当前用于从交易数据中挖掘关联规则的算法大多是确定性的和枚举式的。如果不采取措施来限制搜索空间,即使对于挖掘一个只包含几百个交易项的数据集,它们在计算上也可能难以处理。在本文中,我们开发了一种由吉布斯采样引发的随机搜索过程,以便从项集空间中随机采样关联规则,并从由该采样生成的简化交易数据集中执行规则挖掘。此外,还提出了一种通用的规则重要性度量来指导随机搜索,使得由于随机生成的关联规则构成一个遍历马尔可夫链,在极限情况下可以从简化数据集中以概率1发现项集空间中总体最重要的规则。在模拟研究和一个真实基因组数据示例中,我们展示了如何通过随机搜索和Apriori算法的综合使用来促进关联规则挖掘。

相似文献

1
Boosting association rule mining in large datasets via Gibbs sampling.通过吉布斯采样增强大型数据集中的关联规则挖掘。
Proc Natl Acad Sci U S A. 2016 May 3;113(18):4958-63. doi: 10.1073/pnas.1604553113. Epub 2016 Apr 18.
2
Improving rule-based classification using Harmony Search.使用和声搜索改进基于规则的分类。
PeerJ Comput Sci. 2019 Nov 18;5:e188. doi: 10.7717/peerj-cs.188. eCollection 2019.

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验