The CloSpan algorithm mines closed sequential patterns in two phases and has several shortcomings: it must maintain candidate sequences during the first phase, it does not make full use of item position information, and it repeatedly scans the database and recomputes database sizes. To address these shortcomings, this paper proposes the posCloSpan algorithm. By searching a two-level index structure, the algorithm achieves forward pruning and avoids repeated database scans. By checking a super-sequence index table and a sub-sequence index table, it prunes non-closed sequences without saving candidate sequences. Experimental results show that the algorithm effectively reduces running time on longer sequences and on data sources that produce a large number of duplicate projected databases.
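To make the notion of a closed sequential pattern concrete, the following is a minimal brute-force sketch. It is neither CloSpan nor posCloSpan, and it uses none of the index structures described above. It assumes sequences of single items rather than itemsets, and the function names (`is_subsequence`, `support`, `frequent_patterns`, `closed_patterns`) and the toy database are hypothetical, introduced only for illustration. A frequent pattern is treated as closed when no longer frequent pattern contains it with the same support.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def frequent_patterns(database, min_support):
    """Enumerate every subsequence of every sequence and keep the frequent ones.

    Exponential in sequence length; only suitable for toy inputs.
    """
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_patterns(frequent):
    """Keep patterns that have no longer super-sequence with equal support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed


# Hypothetical toy database: each string is one sequence of single items.
toy_db = ["abc", "abc", "ab", "b"]
toy_closed = closed_patterns(frequent_patterns(toy_db, min_support=2))
```

In this toy database, the pattern `a` has the same support as `ab`, so `a` is not closed, while `b` occurs in every sequence and stays closed. This sketch first collects all frequent patterns and filters them afterwards, which is the kind of candidate maintenance the abstract says posCloSpan avoids.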