Efficient algorithms for simultaneously mining concise representations of sequential patterns based on extended pruning conditions

Hai Duong; Tin Truong; Bac Le

Abstract

The concise representations of sequential patterns, including maximal sequential patterns, closed sequential patterns and sequential generator patterns, play an important role in data mining since they provide several benefits when compared to sequential patterns. One of the most important benefits is that their cardinalities are generally much smaller than the cardinality of the set of all sequential patterns. Therefore, they can be mined more efficiently and use less storage space, and the information they provide is easier for users to analyze. In addition, the set of all maximal sequential patterns can be used to recover the complete set of sequential patterns, while closed sequential patterns and sequential generators can be used together to generate non-redundant sequential rules and to quickly recover all sequential patterns and their frequencies. Several algorithms have been proposed to mine the concise representations separately, i.e., each of them has been designed to discover only one type of concise representation. However, these algorithms remain time-consuming and memory-intensive. To address this problem, we propose three novel efficient algorithms named FMaxSM, FGenCloSM and MaxGenCloSM, which respectively mine only maximal sequential patterns, simultaneously mine both the closed sequential patterns and the generators, and discover all three concise representations in a single process. To our knowledge, MaxGenCloSM is the first algorithm for concurrently mining the three concise representations of sequential patterns. The proposed algorithms are based on two novel local pruning strategies called LPMAX and LPMaxGenClo that are designed to prune non-maximal, non-closed and non-generator patterns earlier and more efficiently at two and three successive levels of the prefix tree without subsequence relation checking. Extensive experiments on real-life and synthetic databases show that FMaxSM, FGenCloSM and MaxGenCloSM are up to two orders of magnitude faster than the state-of-the-art algorithms and that the proposed algorithms consume much less memory, especially for low minimum support thresholds and for dense databases.
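
To make the three concise representations concrete, the following is a minimal brute-force sketch of their usual definitions on a toy database. It is not FMaxSM, FGenCloSM or MaxGenCloSM, and it uses none of the LPMAX or LPMaxGenClo pruning described in the abstract. It rests on several simplifying assumptions that are not from the paper: each sequence is a tuple of single items rather than itemsets, only non-empty patterns are compared, and all function names and the example database are hypothetical.

```python
from itertools import combinations

def is_subsequence(p, s):
    """True if pattern p occurs in sequence s with order preserved (gaps allowed)."""
    it = iter(s)
    return all(item in it for item in p)

def support(p, db):
    """Number of database sequences that contain p as a subsequence."""
    return sum(1 for s in db if is_subsequence(p, s))

def frequent_patterns(db, minsup):
    """Naive enumeration: every non-empty subsequence of every sequence is a candidate."""
    candidates = set()
    for s in db:
        for k in range(1, len(s) + 1):
            for idx in combinations(range(len(s)), k):
                candidates.add(tuple(s[i] for i in idx))
    freq = {}
    for p in candidates:
        sup = support(p, db)
        if sup >= minsup:
            freq[p] = sup
    return freq

def concise_representations(freq):
    """Split frequent patterns into maximal, closed and generator sets by definition."""
    def proper_sub(a, b):
        return len(a) < len(b) and is_subsequence(a, b)

    pats = list(freq)
    # Maximal: no frequent proper super-sequence exists.
    maximal = {p for p in pats if not any(proper_sub(p, q) for q in pats)}
    # Closed: no proper super-sequence has the same support.
    closed = {p for p in pats
              if not any(proper_sub(p, q) and freq[q] == freq[p] for q in pats)}
    # Generator: no proper (non-empty) sub-sequence has the same support.
    generators = {p for p in pats
                  if not any(proper_sub(q, p) and freq[q] == freq[p] for q in pats)}
    return maximal, closed, generators

def support_from_closed(p, closed, freq):
    """Recover the support of a frequent pattern from the closed patterns alone."""
    return max(freq[c] for c in closed if is_subsequence(p, c))

# Hypothetical toy database of four sequences, minimum support 2.
db = [("a", "b", "c"), ("a", "b", "c"), ("a", "b"), ("b",)]
freq = frequent_patterns(db, minsup=2)
maximal, closed, generators = concise_representations(freq)
print(len(freq), len(maximal), len(closed), len(generators))
```

On this toy database the concise sets are smaller than the full set of frequent patterns, every frequent pattern is a subsequence of some maximal pattern, and `support_from_closed` returns the original support of each frequent pattern. The quadratic pairwise checks are exactly the subsequence relation checking that the abstract says the proposed pruning strategies avoid.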

Bibliographic details

Source
Engineering Applications of Artificial Intelligence, 2018, Issue 1, pp. 197-210 (14 pages)
Authors
Hai Duong; Tin Truong; Bac Le
Author affiliations

Department of Mathematics and Computer Science, University of Dalat, Dalat, Viet Nam; VNU-HCMC, University of Natural Sciences, Ho Chi Minh, Viet Nam

Department of Mathematics and Computer Science, University of Dalat, Dalat, Viet Nam

VNU-HCMC, University of Natural Sciences, Ho Chi Minh, Viet Nam

Record information
Original format: PDF
Language of text: English
Keywords
Sequential pattern mining; Frequent sequence; Maximal frequent sequence; Frequent closed sequence; Frequent generator sequences; Vertical data format
