Published in: ACM KDD International Conference on Knowledge Discovery and Data Mining (KDD 2008)
Constructing Comprehensive Summaries of Large Event Sequences

Abstract

Event sequences capture system and user activity over time. Prior research on sequence mining has mostly focused on discovering local patterns. Though interesting, these patterns reveal only local associations and fail to give a comprehensive summary of the entire event sequence. Moreover, the number of patterns discovered can be large. In this paper, we take an alternative approach and build short summaries that describe the entire sequence while revealing local associations among events.

We formally define the summarization problem as an optimization problem that balances the shortness of the summary against the accuracy of the data description. We show that this problem can be solved optimally in polynomial time by using a combination of two dynamic-programming algorithms. We also explore more efficient greedy alternatives and demonstrate that they work well on large datasets. Experiments on both synthetic and real datasets illustrate that our algorithms are efficient, produce high-quality results, and reveal interesting local structures in the data.
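The abstract does not spell out the objective or the two dynamic programs, so the following is only a rough illustration of how a trade-off between summary shortness and description accuracy can be optimized by dynamic programming over contiguous segments. Everything in it is an assumption made for illustration: the names `segment_cost` and `best_segmentation`, the fixed per-segment `penalty` (standing in for summary length), and the empirical-entropy term (standing in for description accuracy). It is not the cost function or the algorithm from the paper.

```python
from collections import Counter
from math import log2


def segment_cost(events, penalty):
    """Hypothetical cost of describing one segment.

    penalty: fixed charge per segment, so fewer segments means a shorter summary.
    bits:    empirical-entropy cost of the events inside the segment, so
             homogeneous segments are described more accurately (cheaper).
    """
    counts = Counter(events)
    n = len(events)
    bits = -sum(c * log2(c / n) for c in counts.values())
    return penalty + bits


def best_segmentation(seq, penalty=2.0):
    """Return (total_cost, segments) minimizing the summed segment costs.

    best[j] holds the minimum cost of summarizing seq[:j]; each segment is a
    half-open index pair (start, end).
    """
    n = len(seq)
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(seq[i:j], penalty)
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers to recover the chosen segment boundaries.
    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    segments.reverse()
    return best[n], segments


total, segments = best_segmentation(list("AAAABBBB"), penalty=2.0)
print(total, segments)
```

With this toy cost, a larger `penalty` favors fewer, coarser segments and a smaller one favors more, purer segments. The sketch recomputes each segment's cost from scratch, which keeps it short but makes it cubic in the sequence length; it is meant to show the shape of the recurrence, not to be efficient.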