Journal: Pattern Recognition Letters

A novel multi-core algorithm for frequent itemsets mining in data streams


Abstract

Data streams are a modern data source gaining attention because of their many practical applications, including data transmission, eCommerce, and intrusion detection systems. Nevertheless, efforts to obtain insights from data streams are limited by their massive volume and the time needed to process them. This paper proposes a new approach to Frequent Itemsets Mining on data streams that is based on prefix trees and takes advantage of multi-core systems. The approach uses the Gearman framework as the interface for multi-core processing, which allows the scalability of such systems to be exploited efficiently. Experimental results show that the proposed method finds the same patterns as similar state-of-the-art approaches and outperforms them in processing time. The method is also shown to be insensitive to variations in the support threshold, and its efficiency depends on the size of the transactions rather than the size of the alphabet, which is a significant issue in other Frequent Itemsets Mining algorithms. (C) 2019 Elsevier B.V. All rights reserved.
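
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of counting itemsets in a prefix tree, not the authors' method. It assumes short transactions (every non-empty subset is enumerated, which is exponential in transaction length) and replaces Gearman with a plain loop over chunks. The names `Node`, `insert_transaction`, `merge`, and `frequent_itemsets` are hypothetical and chosen for illustration.

```python
from itertools import combinations


class Node:
    """One prefix-tree node; the path from the root spells out an itemset."""

    def __init__(self):
        self.count = 0
        self.children = {}


def insert_transaction(root, transaction):
    """Count every non-empty subset of one transaction in the tree."""
    items = sorted(set(transaction))  # canonical order: one path per itemset
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            node = root
            for item in itemset:
                node = node.children.setdefault(item, Node())
            node.count += 1


def merge(target, source):
    """Add the counts of `source` into `target` (combines partial trees)."""
    target.count += source.count
    for item, child in source.children.items():
        merge(target.children.setdefault(item, Node()), child)


def frequent_itemsets(root, min_count, prefix=()):
    """Yield (itemset, count) for itemsets seen at least `min_count` times."""
    for item, child in sorted(root.children.items()):
        if child.count >= min_count:
            itemset = prefix + (item,)
            yield itemset, child.count
            # Supersets can never be more frequent than their prefix,
            # so infrequent branches are skipped entirely.
            yield from frequent_itemsets(child, min_count, itemset)


# Assumption: the stream is split into chunks; in a multi-core setup each
# chunk could be handed to a separate worker that builds its own tree.
chunks = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"]],
]
partial_trees = []
for chunk in chunks:
    tree = Node()
    for transaction in chunk:
        insert_transaction(tree, transaction)
    partial_trees.append(tree)

total = Node()
for tree in partial_trees:
    merge(total, tree)

result = dict(frequent_itemsets(total, min_count=2))
```

In this sketch the work per transaction depends on the number of items in that transaction, not on how many distinct items exist overall, and the partial trees can be built independently before being merged. Whether the paper's algorithm works this way is not stated in the abstract.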
