Source: International Journal of Computer Science and Network Security

Cluster Based Partition Approach for Mining Frequent Itemsets



Abstract

Data mining is the process of extracting interesting and previously unknown patterns and correlations from large volumes of data stored in databases. Association rule mining, a descriptive data mining technique, is the process of discovering items or literals that tend to occur together in transactions. Because the data to be mined is large, the time taken to access it is considerable. This paper describes a new approach to association mining based on a master-slave architecture. It searches for frequent itemsets with a hybrid strategy that combines bottom-up and top-down approaches. The Apriori algorithm performs well only when the frequent itemsets are short, whereas top-down algorithms are suited to long frequent itemsets; the new master-slave algorithm combines both. The prime number based representation consumes less memory because each transaction is replaced with the product of the prime numbers assigned to its items. It also reduces the time taken to determine the support count of itemsets, offers flexibility for testing the validity of metarules, and reduces data complexity.
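The prime number based representation can be illustrated with a small sketch. The abstract does not give the paper's actual encoding or counting procedure, so everything here is an assumption made for illustration: the function names (`first_primes`, `assign_primes`, `encode`, `support_count`), the rule of assigning primes to items in sorted order, and the use of a divisibility test to count support. The test rests on the fact that, when each item maps to a distinct prime, an itemset is contained in a transaction exactly when the itemset's product divides the transaction's product.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (adequate for small item catalogs)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def assign_primes(transactions):
    """Map each distinct item to its own prime (assumed rule: sorted item order)."""
    items = sorted({item for t in transactions for item in t})
    return dict(zip(items, first_primes(len(items))))


def encode(itemset, prime_of):
    """Replace an itemset or transaction with the product of its items' primes."""
    return prod(prime_of[item] for item in set(itemset))


def support_count(itemset, encoded_transactions, prime_of):
    """Count transactions containing the itemset via a divisibility test."""
    code = encode(itemset, prime_of)
    return sum(1 for t in encoded_transactions if t % code == 0)


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "butter"},
]
prime_of = assign_primes(transactions)
encoded = [encode(t, prime_of) for t in transactions]

print(prime_of)
print(encoded)
print(support_count({"bread", "milk"}, encoded, prime_of))
```

In this sketch each transaction is stored as a single integer, and counting support needs one modulo operation per transaction rather than a set comparison. The products grow quickly as the number of items increases; Python integers are arbitrary precision, but an implementation with fixed-width integers would need to handle overflow.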
