Journal: ACM Transactions on Knowledge Discovery from Data

Efficient Mining of Outlying Sequence Patterns for Analyzing Outlierness of Sequence Data

Abstract

Much recent work across different domains has addressed detecting outliers and analyzing their outlierness in relational data. However, although sequence data is ubiquitous in real life, analyzing the outlierness of sequence data has not received enough attention. In this article, we study the problem of mining outlying sequence patterns in sequence data: given a query sequence s in a sequence dataset D, the objective is to discover the sequence patterns that best indicate the unusualness (i.e., outlierness) of s compared with the other sequences. Technically, we measure the outlierness of a sequence by its rank, defined by the average probabilistic strength (aps) of a sequence pattern in that sequence. A minimal sequence pattern under which the query sequence is ranked highest is then defined as an outlying sequence pattern. To address this problem, we present OSPMiner, a heuristic method that computes aps by incorporating several pruning techniques. Our empirical study on both real and synthetic data demonstrates that OSPMiner is effective and efficient.