Journal: Journal of computer sciences
Fast Algorithms for Discovering Sequential Patterns in Massive Datasets

Abstract

Problem statement: Sequential pattern mining is a specific data mining task, applied particularly to retail data. The task is to discover all sequential patterns that meet a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern. Approach: Several algorithms already exist for finding sequential patterns, such as AprioriAll and Generalized Sequential Patterns (GSP). We present fast and efficient algorithms, called AprioriAHSID and GSPSID, for mining sequential patterns; they are fundamentally different from the known algorithms. Results: The proposed algorithms were implemented and compared with AprioriAll and GSP, and their performance was studied experimentally. We also combined the AprioriAHSID algorithm with the AprioriAll algorithm into a hybrid algorithm, called AprioriAll Hybrid. Conclusion: The implementation shows that the execution time of an algorithm for finding sequential patterns depends on the total number of candidates generated at each level and on the time taken to scan the database. Our performance study shows that the proposed algorithms perform very well compared with the best existing algorithms.
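The support definition in the abstract can be illustrated with a small sketch. It is not the AprioriAHSID or GSPSID algorithm, whose details the abstract does not give; it only counts support by brute force. It assumes the usual notion of containment for sequential patterns (each pattern element is an itemset that must be a subset of some element of the data-sequence, in order). The function names and the toy retail database are hypothetical.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
DataSequence = Sequence[Itemset]


def contains(data_seq: DataSequence, pattern: DataSequence) -> bool:
    """Return True if `pattern` occurs in `data_seq` as an ordered subsequence.

    Assumption: a pattern element matches a data element when it is a subset
    of it. Matching each pattern element at the earliest possible position
    is sufficient to decide containment.
    """
    pos = 0
    for element in pattern:
        # Advance until an element of the data-sequence covers this itemset.
        while pos < len(data_seq) and not element <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next pattern element must match strictly later
    return True


def support(database: List[DataSequence], pattern: DataSequence) -> int:
    """Support = number of data-sequences that contain the pattern."""
    return sum(1 for seq in database if contains(seq, pattern))


def frequent(database: List[DataSequence],
             candidates: List[DataSequence],
             min_support: int) -> List[DataSequence]:
    """Keep only the candidates that reach the minimum support."""
    return [c for c in candidates if support(database, c) >= min_support]


def seq(*elements: str) -> List[Itemset]:
    """Hypothetical helper: seq("a b", "c") -> [{a, b}, {c}]."""
    return [frozenset(e.split()) for e in elements]


# Toy retail database (illustrative data only): one data-sequence per customer.
DATABASE = [
    seq("bread", "milk eggs", "butter"),
    seq("bread milk", "butter"),
    seq("milk", "bread"),
    seq("bread", "eggs", "milk"),
]

BREAD_THEN_BUTTER = seq("bread", "butter")
BREAD_THEN_MILK = seq("bread", "milk")
MILK_ONLY = seq("milk")

if __name__ == "__main__":
    for p in (BREAD_THEN_BUTTER, BREAD_THEN_MILK, MILK_ONLY):
        print([sorted(e) for e in p], support(DATABASE, p))
```

Each call to support scans the whole database once per candidate, which reflects the two cost factors named in the conclusion: the number of candidates generated at each level and the time taken to scan the database.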
