Journal: Engineering Applications of Artificial Intelligence

An efficient approach for mining sequential patterns using multiple threads on very large databases


Abstract

Sequential pattern mining (SPM) plays an important role in data mining, with broad applications in areas such as financial markets, education, medicine, and prediction. Although many efficient SPM algorithms exist, mining time remains high, especially for huge databases, which calls for parallel techniques. In this paper, we propose a parallel approach named MCM-SPADE (Multiple threads CM-SPADE), a multi-threaded technique for SPM on very large databases that runs on multi-core processor systems and improves on the performance of the earlier methods SPADE and CM-SPADE. The proposed algorithm uses the vertical data format and a data structure named CMAP (Co-occurrence MAP) that stores co-occurrence information. Using CMAP, the algorithm prunes candidates early to reduce the search space, and it uses the divide-and-conquer property to partition related tasks across processor cores. It also uses dynamic scheduling to avoid idle cores and to balance the load between them. The experimental results show that MCM-SPADE attains good parallelization efficiency on various input databases.
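The ideas named in the abstract (vertical data format, a co-occurrence map for early pruning, one task per prefix class, and dynamic scheduling) can be illustrated with a minimal Python sketch. This is not the authors' MCM-SPADE implementation. It assumes that every sequence is a list of single items (no itemsets), it prunes only on the last item of the prefix, and every function name in it (`build_vertical`, `build_cmap`, `extend`, `mine_class`, `mine_parallel`) is hypothetical. In CPython the global interpreter lock limits the speed-up of CPU-bound threads, so the sketch shows the task structure rather than the performance.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_vertical(db):
    """Vertical format: item -> {sequence id: sorted positions of the item}."""
    vertical = defaultdict(lambda: defaultdict(list))
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            vertical[item][sid].append(pos)
    # Plain dicts are safer to share read-only between threads.
    return {item: dict(sids) for item, sids in vertical.items()}


def build_cmap(db, minsup):
    """Co-occurrence map: a -> items b that follow a in at least minsup sequences."""
    counts = defaultdict(int)
    for seq in db:
        pairs = {(a, b) for i, a in enumerate(seq) for b in seq[i + 1:]}
        for pair in pairs:  # each pair counts once per sequence
            counts[pair] += 1
    cmap = defaultdict(set)
    for (a, b), count in counts.items():
        if count >= minsup:
            cmap[a].add(b)
    return dict(cmap)


def extend(idlist, item_idlist):
    """Temporal join: keep item positions after the prefix's earliest end position."""
    joined = {}
    for sid, ends in idlist.items():
        if sid in item_idlist:
            later = [p for p in item_idlist[sid] if p > ends[0]]
            if later:
                joined[sid] = later
    return joined


def mine_class(prefix, idlist, vertical, cmap, frequent, minsup):
    """Depth-first search of all frequent patterns that start with `prefix`."""
    found = {prefix: len(idlist)}
    allowed = cmap.get(prefix[-1], ())
    for item in frequent:
        if item not in allowed:  # early pruning: skip the join entirely
            continue
        new_idlist = extend(idlist, vertical[item])
        if len(new_idlist) >= minsup:
            found.update(mine_class(prefix + (item,), new_idlist,
                                    vertical, cmap, frequent, minsup))
    return found


def mine_parallel(db, minsup, workers=4):
    """Mine each single-item prefix class as an independent task."""
    vertical = build_vertical(db)
    cmap = build_cmap(db, minsup)
    frequent = sorted(i for i, sids in vertical.items() if len(sids) >= minsup)
    patterns = {}
    # The pool hands the next queued class to whichever worker becomes free,
    # which is a simple form of dynamic scheduling.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(mine_class, (item,), vertical[item],
                               vertical, cmap, frequent, minsup)
                   for item in frequent]
        for future in futures:
            patterns.update(future.result())
    return patterns
```

Calling `mine_parallel(db, minsup)` returns a dictionary that maps each frequent pattern (a tuple of items) to its support. The classes are independent because every pattern in a class shares the same first item, so the workers need no locking beyond the pool's task queue.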

Bibliographic details

  • Author affiliations

    Center for Applied Information Technology, Ton Duc Thang University, Faculty of Information Technology, Ton Duc Thang University;

    Department of Computing and Computer Services, Ton Duc Thang University, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava;

    Department of Computing and Computer Services, Ton Duc Thang University, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava;

    Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH);

    Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), College of Electronics and Information Engineering, Sejong University;

    Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava;

  • Original format: PDF
  • Language: English
  • Keywords

    Sequential patterns; Multi-core processors; Multi-threading; Early pruning;

