Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

Jiawei Han; Jian Pei; Yiwen Yin; Runying Mao

Abstract

Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are many patterns and/or long patterns. In this study, we propose a novel frequent-pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a condensed, smaller data structure, the FP-tree, which avoids costly, repeated database scans; (2) our FP-tree-based mining adopts a pattern-fragment growth method to avoid the costly generation of a large number of candidate sets; and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported frequent-pattern mining methods.
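
The three techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names Node, build_tree, fp_growth, and mine are hypothetical, the header table is a plain list of nodes per item rather than a chain of node-links, and no single-path shortcut is attempted. The toy transactions in the demo are illustrative only. build_tree compresses the database into a prefix tree in two passes, and fp_growth grows patterns by recursing on conditional pattern bases instead of generating candidate sets.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, a parent link and child links."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, weight) pairs using two passes."""
    # Pass 1: count the support of every single item.
    support = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            support[item] += weight
    frequent = {i: s for i, s in support.items() if s >= min_support}

    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending support so that common prefixes are shared.
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    for items, weight in weighted_transactions:
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, frequent


def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, frequent = build_tree(weighted_transactions, min_support)
    for item, item_support in frequent.items():
        pattern = suffix | {item}
        yield pattern, item_support
        # Conditional pattern base: the prefix path of every node that
        # carries `item`, weighted by that node's count.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Divide and conquer: mine the smaller conditional database.
        yield from fp_growth(conditional, min_support, pattern)


def mine(transactions, min_support):
    """Convenience wrapper: plain transactions in, {itemset: support} out."""
    weighted = [(t, 1) for t in transactions]
    return dict(fp_growth(weighted, min_support))


if __name__ == "__main__":
    # Illustrative toy data, not taken from the paper.
    toy = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    for itemset, count in sorted(mine(toy, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

With a minimum support of 3 on the toy data, the sketch reports the three single items and the three pairs; {a, b, c} appears in only two transactions and is correctly left out. Representing each conditional pattern base as weighted paths lets the same build_tree routine serve both the original database and every conditional database.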

Bibliographic Record

Source
Data Mining and Knowledge Discovery | 2004, Issue 1 | 35 pages
Authors
Jiawei Han; Jian Pei; Yiwen Yin; Runying Mao
Author Affiliations

University of Illinois at Urbana-Champaign;

State University of New York at Buffalo;

Simon Fraser University;

Microsoft Corporation;

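Indexing Information
Original format: PDF
Text language: English
Chinese Library Classification: Computing technology, computer technology
Keywords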
frequent pattern mining; association mining; algorithm; performance improvements; data structure;
