
Closed frequent similar pattern mining: Reducing the number of frequent similar patterns without information loss


Abstract

Frequent pattern mining is considered a key task for discovering useful information. Despite the quality of the solutions given by frequent pattern mining algorithms, most of them face the challenge of reducing the number of frequent patterns without information loss. Frequent itemset mining addresses this problem by discovering a reduced set of frequent itemsets, named closed frequent itemsets, from which the entire frequent pattern set can be recovered. However, for frequent similar pattern mining, where the number of patterns is even larger than for frequent itemset mining, this problem has not yet been addressed. In this paper, we introduce the concept of closed frequent similar pattern mining to discover a reduced set of frequent similar patterns without information loss. Additionally, a novel closed frequent similar pattern mining algorithm, named CFSP-Miner, is proposed. The algorithm discovers frequent patterns by traversing a tree that contains all the closed frequent similar patterns. To do this efficiently, several lemmas to prune the search space are introduced and proven. The results show that CFSP-Miner is more efficient than the state-of-the-art frequent similar pattern mining algorithms, except in cases where the numbers of frequent similar patterns and closed frequent similar patterns are almost equal. Even in those cases, CFSP-Miner finds the closed similar patterns, yielding a reduced frequent similar pattern set without information loss. CFSP-Miner also shows good scalability while maintaining acceptable runtime performance. (C) 2017 Elsevier Ltd. All rights reserved.
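To make the "without information loss" claim concrete, the following is a minimal brute-force sketch of the classic closed frequent itemset idea that the abstract refers to. It is not CFSP-Miner and does not model similarity functions; the function names (`frequent_itemsets`, `closed_itemsets`, `recover_support`) and the toy transactions are assumptions made up for illustration. An itemset is treated as closed when no proper superset has the same support, and the support of any frequent itemset is recovered as the largest support among its closed supersets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Enumerate all frequent itemsets by brute force (toy sizes only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
    }


def recover_support(itemset, closed):
    """Support of an itemset = max support over its closed supersets.

    Returns None when no closed superset exists, i.e. the itemset
    is not frequent.
    """
    counts = [count for c, count in closed.items() if itemset <= c]
    return max(counts) if counts else None


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ad"),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(closed), "closed")
for itemset, count in frequent.items():
    # Every frequent itemset and its support is recoverable from `closed`.
    assert recover_support(itemset, closed) == count
```

On this toy input the closed set is smaller than the full frequent set, yet every frequent itemset and its support can be rebuilt from it. The exhaustive enumeration is exponential in the number of items, so it only serves to illustrate the definition, not the tree traversal and pruning lemmas described in the abstract.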
