A Quick Look at Methods for Mining Long Subsequences

Abstract

Pattern discovery, the search for frequently occurring subsequences (called sequential patterns) in sequences, is a well-known data-mining task. Sequences of events occur naturally in many domains. We address an abstract version of the problem of finding frequent sequences of page accesses in a log file by considering the problem of finding frequent subsequences in a sequence dataset. In the abstract problem, the 26 uppercase letters represent the possible web pages, and we examine the problem of finding frequently occurring subsequences of items in a very long sequence. The particular problem studied is to find all frequently occurring substrings of length K or less in a very long string. The advantage of the Heuristic Depth-First (HDF) algorithm, which is based on the Depth-First (DF) algorithm, is explained by comparison with the Breadth-First (BF) algorithm.
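
The problem stated in the abstract, finding every substring of length K or less that occurs frequently in a long string, can be illustrated with a minimal Python sketch. The abstract does not define "frequently" or describe the HDF heuristic, so this sketch assumes a simple occurrence threshold (`min_count`, counting overlapping occurrences) and shows only plain breadth-first and depth-first searches. The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def frequent_substrings_bf(text, max_len, min_count):
    """Breadth-first (level-wise): count all length-k candidates before length k+1."""
    frequent = {}
    previous = {""}  # frequent substrings found at the previous length
    for k in range(1, max_len + 1):
        counts = Counter(
            text[i:i + k]
            for i in range(len(text) - k + 1)
            # Prune: a substring can only be frequent if its prefix is frequent.
            if text[i:i + k - 1] in previous
        )
        current = {s: c for s, c in counts.items() if c >= min_count}
        if not current:
            break
        frequent.update(current)
        previous = set(current)
    return frequent


def frequent_substrings_df(text, max_len, min_count):
    """Depth-first: extend one frequent pattern as far as possible before its siblings."""
    frequent = {}

    def extend(pattern, starts):
        if len(pattern) == max_len:
            return
        # Group the start positions of `pattern` by the character that follows it.
        by_next = defaultdict(list)
        for s in starts:
            end = s + len(pattern)
            if end < len(text):
                by_next[text[end]].append(s)
        for ch, positions in by_next.items():
            if len(positions) >= min_count:
                frequent[pattern + ch] = len(positions)
                extend(pattern + ch, positions)

    extend("", list(range(len(text))))
    return frequent


if __name__ == "__main__":
    sample = "ABABCABAB"  # made-up input, one uppercase letter per page access
    print(frequent_substrings_bf(sample, max_len=3, min_count=2))
    print(frequent_substrings_df(sample, max_len=3, min_count=2))
```

Both functions return the same set of patterns. The breadth-first version rescans the string once per length, while the depth-first version carries forward only the start positions of the pattern it is currently extending. Any heuristic ordering of extensions, as the HDF name suggests, would be layered on top of the depth-first variant and is not attempted here.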