Conference: Data Mining Workshops, 2009. ICDMW '09

Efficient Incremental Mining of Qualified Web Traversal Patterns without Scanning Original Databases



Abstract

Discovering web traversal patterns is an important problem in web usage mining, with applications such as navigation prediction and improved website management. Because web data grows rapidly and some of it becomes outdated over time, re-mining web traversal patterns must account for both newly inserted data and the deletion of old data. To reduce the overhead of re-mining the patterns from the entire web data set, an incremental mining approach is needed: one that reuses previous mining results and computes new patterns from only the inserted or deleted part of the web data. In this paper, we propose an efficient incremental web traversal pattern mining algorithm named IncWTP_PLM (Incremental mining of Web Traversal Patterns by using Projected-database Link Matrix). We also propose a special data structure, the Projected-database Link Matrix, to avoid scanning the original database. In addition, IncWTP_PLM takes the website structure into account so that every discovered web traversal pattern is qualified. Experimental results show that our algorithm substantially outperforms other approaches in terms of efficiency.
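To make the idea of incremental maintenance concrete, the following is a minimal Python sketch. It is not IncWTP_PLM and does not implement the Projected-database Link Matrix, whose details the abstract does not give. It assumes that a "qualified" pattern is a contiguous path in which every step follows an existing hyperlink, and it naively keeps a support count for every such path so that inserting or deleting a session touches only that session. The names `qualified_subpaths`, `IncrementalPathCounter`, `links`, and `max_len` are hypothetical.

```python
from collections import Counter


def qualified_subpaths(session, links, max_len):
    """Yield each distinct contiguous sub-path of `session` (length 1..max_len)
    in which every consecutive page pair is a hyperlink in `links`.
    (Assumed meaning of "qualified"; the abstract does not define it.)"""
    seen = set()
    n = len(session)
    for start in range(n):
        for end in range(start + 1, min(start + max_len, n) + 1):
            # Stop extending this path once a step does not follow a real link.
            if end - start >= 2 and (session[end - 2], session[end - 1]) not in links:
                break
            path = tuple(session[start:end])
            if path not in seen:
                seen.add(path)
                yield path


class IncrementalPathCounter:
    """Naive illustration: support counts are updated from the inserted or
    deleted session alone, so earlier sessions are never rescanned."""

    def __init__(self, links, max_len=3):
        self.links = set(links)      # website structure as (from_page, to_page) pairs
        self.max_len = max_len       # longest pattern tracked
        self.counts = Counter()      # path -> number of sessions containing it
        self.num_sessions = 0

    def insert(self, session):
        for path in qualified_subpaths(session, self.links, self.max_len):
            self.counts[path] += 1
        self.num_sessions += 1

    def delete(self, session):
        # The caller must pass a session that was previously inserted.
        for path in qualified_subpaths(session, self.links, self.max_len):
            self.counts[path] -= 1
            if self.counts[path] == 0:
                del self.counts[path]
        self.num_sessions -= 1

    def frequent(self, min_support):
        """Return paths contained in at least `min_support` (a fraction)
        of the sessions currently stored."""
        threshold = min_support * self.num_sessions
        return {p: c for p, c in self.counts.items() if c >= threshold}
```

A caller would build the counter from the site's link pairs, call `insert` for each new session and `delete` for each expired one, and query `frequent` with a support fraction. Storing counts for all candidate paths trades memory for simplicity; a more careful design would keep only what is needed to update the frequent patterns.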
