Journal: International Journal of Soft Computing

A Novel Scheme for Candidate Generation for Mining Frequent Patterns


Abstract

With the explosive growth of data, mining information and knowledge from large databases has become one of the major challenges for the data management and mining community. Data mining is the extraction of hidden predictive information from large databases. It is concerned with analyzing data and finding patterns that exist in large databases but are hidden among the vast amount of data. Association rules are one of the most popular data mining techniques and are particularly useful for discovering relationships among data in huge databases. The first step in mining association rules is mining frequent patterns. This study proposes a novel scheme for candidate generation that generates all the candidate item sets in three iterations. A new algorithm for association rule mining, called AR-mine, is also presented; it uses the proposed scheme for candidate generation. A distinct feature of this algorithm is that a candidate item set is generated only when the algorithm actually encounters an occurrence of that set in the database. Another important feature is that it requires only three scans of the database. A simple hash table is used to store the candidate item sets, which speeds up the search for them. Our experiments with synthetic and real-life data sets show that AR-mine performs better than Apriori, a well-known and widely used algorithm for association rule mining.
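The abstract does not spell out the three-iteration scheme, so the following is only a minimal sketch of the two ideas it does state: a candidate item set is created only when it occurs in some transaction, and candidates live in a hash table. It is not the AR-mine algorithm. The function name `count_occurring_itemsets`, the `min_support` threshold, the two-scan layout, and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_occurring_itemsets(transactions, min_support, size=2):
    """Count item sets of a given size that actually occur in the data.

    Hypothetical helper for illustration; not the paper's AR-mine procedure.
    """
    # Scan 1: count single items and keep those that meet the threshold.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Scan 2: a candidate enters the hash table (a Counter, i.e. a dict)
    # only when some transaction contains it, so item sets that never
    # occur are never generated.
    candidates = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        for itemset in combinations(kept, size):
            candidates[itemset] += 1

    # Keep only the candidates that meet the support threshold.
    return {s: c for s, c in candidates.items() if c >= min_support}


# Made-up sample data, used only to exercise the sketch.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
frequent_pairs = count_occurring_itemsets(transactions, min_support=3)
```

By contrast, Apriori builds its candidates by joining the frequent item sets from the previous pass, so it can generate candidates that never appear together in any transaction. Counting only occurring item sets avoids that, at the cost of enumerating subsets of each transaction.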
