Journal: Future Generation Computer Systems

A fast and resource efficient mining algorithm for discovering frequent patterns in distributed computing environments

Abstract

Advances in electronic technology make it possible to collect logs from a wide variety of devices. Such logs must be analyzed in detail before they become broadly useful. Data mining is widely used to extract hidden information from such data; its main tasks are association rule mining, sequential pattern mining, classification, and clustering. Association rule mining has attracted significant attention and has been applied successfully in various fields. Although previous studies can effectively discover the frequent patterns from which association rules are deduced, execution efficiency remains a critical problem. To speed up execution, many methods based on parallel and distributed computing have been proposed in recent years. Most of these studies focus on parallelizing the workload on a high-end machine or in distributed computing environments such as grid or cloud computing systems; very few discuss how to efficiently determine the appropriate number of computing nodes with execution efficiency and load balancing in mind. The intuition is that execution speed is proportional to the number of computing nodes: the more nodes, the faster the execution. For such algorithms, however, this intuition is incorrect because of their inherent algorithmic design. Allocating too many computing nodes can increase execution time, and inappropriate resource allocation also wastes computing power and network bandwidth. Conversely, the load cannot be distributed effectively if too few nodes are allocated. In this paper, we propose FLR-Mining, a fast, load-balancing, and resource-efficient algorithm for discovering frequent patterns in distributed computing systems. FLR-Mining automatically determines the appropriate number of computing nodes and achieves better load balancing than existing methods. Empirical evaluation shows that FLR-Mining delivers excellent performance in terms of execution efficiency and load balancing.
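
The claim that more computing nodes do not always mean faster execution can be illustrated with a toy cost model. The sketch below is not the FLR-Mining method, which the abstract does not describe in detail. It assumes, purely for illustration, that compute time shrinks as work / nodes while coordination cost grows linearly with the node count. The function names and the constants (work, overhead_per_node) are hypothetical.

```python
def estimated_time(nodes, work=1200.0, overhead_per_node=3.0):
    """Toy model (an assumption, not taken from the paper):
    compute time shrinks as work / nodes, while coordination
    cost grows linearly with the number of nodes."""
    return work / nodes + overhead_per_node * nodes


def best_node_count(max_nodes, work=1200.0, overhead_per_node=3.0):
    """Return the node count in 1..max_nodes with the lowest estimated time."""
    return min(
        range(1, max_nodes + 1),
        key=lambda n: estimated_time(n, work, overhead_per_node),
    )


if __name__ == "__main__":
    for n in (1, 5, 20, 64):
        print(n, estimated_time(n))
    print("best:", best_node_count(64))
```

Under these assumed constants the estimated time falls until 20 nodes and rises beyond that point, so allocating all 64 available nodes would be slower than allocating 20. A real system would need a measured cost model in place of this formula.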