Method and system for mining generalized sequential patterns in a large database
Abstract
A method and apparatus are disclosed for mining generalized sequential patterns from a large database of data sequences, taking into account user-specified constraints on the time gap between adjacent elements of the patterns, a sliding time window, and taxonomies over data items. The invention first identifies the items with at least a minimum support, i.e., those contained in more than a minimum number of data sequences. These items are used as a seed set to generate candidate sequences. Next, the support of each candidate sequence is counted. The invention then identifies those candidate sequences that are frequent, i.e., those with a support above the minimum support. The frequent candidate sequences are entered into the set of sequential patterns and are used to generate the next group of candidate sequences. Preferably, the candidate sequences are generated by joining previously found frequent candidate sequences, and candidate sequences having a contiguous subsequence without minimum support are discarded. In addition, the invention includes a hash-tree data structure for storing the candidate sequences and memory management techniques for performance improvement.
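
The level-wise loop described in the abstract (seed with frequent items, join frequent sequences into candidates, prune, count support, repeat) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the patented implementation: each pattern element is a single item, there are no time-gap constraints, sliding window, or taxonomies, candidates are held in a plain set rather than a hash tree, and "frequent" is taken to mean support greater than or equal to `min_support`. Because there are no gap constraints, the pruning step checks every subsequence obtained by dropping one item instead of only the contiguous ones. The names `contains`, `support`, `generate_candidates`, and `mine` are hypothetical.

```python
from itertools import chain


def contains(data_seq, pattern):
    """Return True if the items of `pattern` occur in order in `data_seq`,
    each one in a strictly later transaction than the previous one."""
    pos = 0
    for item in pattern:
        # Greedily advance to the earliest transaction holding this item.
        while pos < len(data_seq) and item not in data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next item must come from a later transaction
    return True


def support(database, pattern):
    """Number of data sequences that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in database)


def generate_candidates(frequent):
    """Join frequent k-patterns into (k+1)-candidates, then prune.

    Two patterns join when dropping the first item of one equals dropping
    the last item of the other. A candidate is discarded if any
    subsequence formed by removing one item is not frequent (assumption:
    no gap constraints, so all such subsequences may be checked)."""
    candidates = set()
    for s1 in frequent:
        for s2 in frequent:
            if s1[1:] == s2[:-1]:
                cand = s1 + (s2[-1],)
                subs = (cand[:i] + cand[i + 1:] for i in range(len(cand)))
                if all(sub in frequent for sub in subs):
                    candidates.add(cand)
    return candidates


def mine(database, min_support):
    """Return a dict mapping each frequent pattern (a tuple) to its support."""
    items = set(chain.from_iterable(chain.from_iterable(database)))
    # Seed set: single items that meet the minimum support.
    frequent = {(i,) for i in items if support(database, (i,)) >= min_support}
    patterns = {}
    while frequent:
        for p in frequent:
            patterns[p] = support(database, p)
        candidates = generate_candidates(frequent)
        frequent = {c for c in candidates
                    if support(database, c) >= min_support}
    return patterns
```

As a usage example, take the four data sequences `[{"a"}, {"b"}, {"c"}]`, `[{"a"}, {"c"}]`, `[{"a", "b"}, {"c"}]`, and `[{"b"}, {"a"}]` with `min_support=2`. `mine` then reports the single items `a`, `b`, and `c` together with the two-item patterns `(a, c)` and `(b, c)`. No three-item candidate can be joined from those two patterns, so the loop stops there.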