Journal of computational and theoretical nanoscience

Distributed Association Rule Mining with Load Balancing in Grid Environment


Abstract

This work aims to speed up frequent pattern mining and to calculate and distribute the workload using a Distributed Parallel Apriori (DPA) algorithm in a grid computing environment. Apriori TID is used to mine frequent itemsets and generate association rules. Rules that represent an association between the values of certain attributes and those of others are called association rules, and the extraction of such rules is termed association rule mining. The work provides a parallel and distributed rule mining algorithm. It mines frequent itemsets at a faster rate using a sparse matrix representation and selects the best rules based on rule interestingness measures such as the Pearson coefficient and chi-square. Performance was measured as running time against minimum support, itemset length, and number of transactions, and as memory usage when itemsets grow longer. The work was implemented in the R language in R Studio, which comes with the R data mining toolkit. A grid environment with four clusters was set up using GridR for R, and DPA was tested against various existing pattern mining algorithms on various data sets.
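As a rough illustration of the ideas named in the abstract, the following single-machine Python sketch mines frequent itemsets level by level using transaction-ID sets and then scores rules by confidence, the Pearson (phi) coefficient, and chi-square. It is not the DPA algorithm, it does not reproduce the authors' R implementation or their grid setup, and it simplifies Apriori TID to plain tidset intersection. The function names `frequent_itemsets` and `association_rules` and the sample transactions are hypothetical, chosen only for this example.

```python
import math
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; support is counted via transaction-ID sets."""
    n = len(transactions)
    # Map each single item to the set of transaction IDs that contain it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(frozenset([item]), set()).add(tid)
    current = {s: t for s, t in tidsets.items() if len(t) / n >= min_support}
    result = dict(current)
    k = 2
    while current:
        candidates = {}
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k or union in candidates:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(union, k - 1)):
                continue
            tids = current[a] & current[b]  # transactions containing all items
            if len(tids) / n >= min_support:
                candidates[union] = tids
        result.update(candidates)
        current = candidates
        k += 1
    return {s: len(t) / n for s, t in result.items()}


def association_rules(supports, n, min_confidence):
    """Return (lhs, rhs, support, confidence, phi, chi_square) tuples."""
    rules = []
    for itemset, s_ab in supports.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), r):
                lhs = frozenset(lhs_items)
                rhs = itemset - lhs
                s_a, s_b = supports[lhs], supports[rhs]
                confidence = s_ab / s_a
                if confidence < min_confidence:
                    continue
                # Pearson (phi) coefficient of the 2x2 table for lhs vs rhs.
                denom = math.sqrt(s_a * (1 - s_a) * s_b * (1 - s_b))
                phi = (s_ab - s_a * s_b) / denom if denom > 0 else 0.0
                # For a 2x2 table, chi-square equals n * phi^2.
                rules.append((lhs, rhs, s_ab, confidence, phi, n * phi * phi))
    return rules


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
supports = frequent_itemsets(transactions, min_support=0.5)
found = association_rules(supports, len(transactions), min_confidence=0.9)
```

In a distributed setting, the transactions would be partitioned across nodes and the local counts combined; that step, and any load balancing, is left out here because the abstract does not describe how DPA performs it.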
