Journal: Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies

A novel MapReduce algorithm for distributed mining of sequential patterns using co-occurrence information



Abstract

The Sequential Pattern Mining (SPM) problem has been widely studied and extended in several directions. With the tremendous growth in dataset size, traditional algorithms no longer scale. To address this scalability issue, a few researchers have recently developed distributed algorithms based on MapReduce. However, the existing MapReduce algorithms require multiple rounds of MapReduce, which increases communication and scheduling overhead. They also do not address the handling of long sequences, and they generate a huge number of candidate sequences that do not appear in the input database, which enlarges the search space and increases the number of candidates for support counting. Our algorithm is a two-phase MapReduce algorithm that generates the promising candidate sequences by applying pruning strategies. This reduces the search space and makes support computation more efficient. We make use of item co-occurrence information, and the proposed Sequence Index List (SIL) data structure helps compute support quickly. Experimental results show that the proposed algorithm outperforms the existing MapReduce algorithms for the SPM problem.
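The abstract does not specify how the co-occurrence information or the Sequence Index List is built, so the following is only a minimal single-process sketch of the general idea, not the authors' algorithm. It assumes sequences of single items (no itemsets), uses a plain per-item position index as a stand-in for an index list, and prunes a candidate extension whenever some item of the pattern is not followed by the new item in at least `min_sup` sequences. All names (`build_index`, `cooccurrence`, `support`, `mine`) are hypothetical, and the MapReduce distribution itself is not modeled.

```python
from collections import defaultdict


def build_index(db):
    """Map each item to {sequence id: ascending positions of that item}."""
    index = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            index[item].setdefault(sid, []).append(pos)
    return index


def cooccurrence(db, min_sup):
    """Return pairs (a, b) where b occurs after a in at least min_sup sequences."""
    counts = defaultdict(int)
    for seq in db:
        pairs = set()
        for i, a in enumerate(seq):
            for b in seq[i + 1:]:
                pairs.add((a, b))
        for pair in pairs:  # count each pair once per sequence
            counts[pair] += 1
    return {pair for pair, c in counts.items() if c >= min_sup}


def support(pattern, index):
    """Count sequences that contain pattern as a subsequence."""
    count = 0
    for sid in index.get(pattern[0], {}):
        pos = -1
        matched = True
        for item in pattern:
            positions = index.get(item, {}).get(sid, [])
            # Greedy match: earliest occurrence strictly after the last match.
            nxt = next((p for p in positions if p > pos), None)
            if nxt is None:
                matched = False
                break
            pos = nxt
        if matched:
            count += 1
    return count


def mine(db, min_sup):
    """Breadth-first mining with co-occurrence pruning (assumed strategy)."""
    index = build_index(db)
    cmap = cooccurrence(db, min_sup)
    items = sorted(i for i, occ in index.items() if len(occ) >= min_sup)
    frequent = {(i,): len(index[i]) for i in items}
    frontier = list(frequent)
    while frontier:
        next_frontier = []
        for pat in frontier:
            for item in items:
                # If pat + (item,) were frequent, every (a, item) pair would
                # be frequent too, so a missing pair rules the candidate out.
                if not all((a, item) in cmap for a in pat):
                    continue
                cand = pat + (item,)
                sup = support(cand, index)
                if sup >= min_sup:
                    frequent[cand] = sup
                    next_frontier.append(cand)
        frontier = next_frontier
    return frequent
```

Calling `mine(db, min_sup)` on a list of item sequences returns a dictionary from frequent patterns (as tuples) to their supports. The pruning check is sound because any sequence containing the extended pattern also contains each item pair it implies, so skipped candidates could never have reached `min_sup`.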
