Efficient algorithms for mining clickstream patterns using pseudo-IDLists

Abstract

Sequential pattern mining is an important task in data mining. Its subproblem, clickstream pattern mining, is attracting more research due to the growth of the Internet and the need to analyze online customer behavior. To date, only a few works have been proposed specifically for mining clickstream patterns. Although general sequential pattern mining algorithms can be applied, their performance may suffer and they may consume more resources than a method dedicated to clickstreams would need. In this paper, we present pseudo-IDList, a novel data structure that is more suitable for clickstream pattern mining. Based on this structure, a vertical format algorithm named CUP (Clickstream pattern mining Using Pseudo-IDList) is proposed. Furthermore, we propose a pruning heuristic named DUB (Dynamic intersection Upper Bound) to improve the proposed algorithm. Experiments on four real-life clickstream databases show that the proposed methods are effective and efficient in terms of runtime and memory consumption.
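
For readers unfamiliar with the vertical format mentioned in the abstract, the following is a minimal sketch of a generic ID-list representation and join, in the style of classic vertical sequential pattern miners. It is not the paper's pseudo-IDList, CUP, or DUB, whose details are not given here. The function names, the toy database, and the assumption that a pattern may match with gaps between clicks are all hypothetical, and the bound shown is only the plain anti-monotone one.

```python
from collections import defaultdict


def build_idlists(database):
    """Map each item to {sequence id: sorted positions where it occurs}."""
    idlists = defaultdict(dict)
    for sid, sequence in enumerate(database):
        for pos, item in enumerate(sequence):
            idlists[item].setdefault(sid, []).append(pos)
    return dict(idlists)


def extend(pattern_idlist, item_idlist):
    """Join two ID-lists: occurrences of 'pattern, then item later on'.

    Assumption: gaps between matched clicks are allowed.
    """
    joined = {}
    for sid, pattern_ends in pattern_idlist.items():
        item_positions = item_idlist.get(sid)
        if not item_positions:
            continue
        first_end = pattern_ends[0]  # earliest position where the pattern can end
        later = [p for p in item_positions if p > first_end]
        if later:
            joined[sid] = later
    return joined


def support(idlist):
    """Number of sequences that contain the pattern."""
    return len(idlist)


def support_upper_bound(pattern_idlist, item_idlist):
    """Plain anti-monotone bound (not DUB): the extension cannot be
    supported by more sequences than either of its two operands."""
    return min(len(pattern_idlist), len(item_idlist))


# Hypothetical toy clickstream database: one list of page IDs per session.
db = [["a", "b", "c"], ["a", "c", "b"], ["b", "a"], ["a", "b", "a", "c"]]
idlists = build_idlists(db)
ab = extend(idlists["a"], idlists["b"])
abc = extend(ab, idlists["c"])
print(support(ab), support(abc), support_upper_bound(ab, idlists["c"]))
```

A miner built this way can skip the join whenever the upper bound already falls below the minimum support threshold, which is the general idea behind candidate pruning in vertical-format algorithms.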

Bibliographic details

  • Source
    Future generation computer systems | 2020, Issue 6 | pp. 18-30 | 13 pages
  • Author affiliations

    Institute of Research and Development, Duy Tan University, Da Nang 550000, Viet Nam;

    School of Computer Science and Engineering, International University, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam;

    Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Viet Nam;

    Department of Computer Engineering, Sejong University, Seoul, Republic of Korea;

    Faculty of Applied Informatics, Tomas Bata University in Zlin, Nam. T.G. Masaryka 5555, Zlin, Czech Republic;

    Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Sequential pattern mining; Clickstream pattern mining; Candidate pruning; Vertical format;
