Journal: The Journal of Systems and Software

An efficient tree-based algorithm for mining sequential patterns with multiple minimum supports


Abstract

Sequential pattern mining (SPM) is an important technique for discovering time-related behavior in sequence databases. In real-life applications, the items in a sequence database do not occur with equal frequency. If the same minimum support (minsup) is applied to all items, the rare item problem may arise, meaning that interesting patterns cannot be retrieved effectively whether minsup is set too high or too low. Liu (2006) first introduced the concept of multiple minimum supports (MMSs) into SPM, which allows users to specify a minimum item support (MIS) for each item according to its natural frequency. A generalized sequential pattern-based algorithm, named Multiple Supports - Generalized Sequential Pattern (MS-GSP), was also developed to mine the complete set of sequential patterns. However, MS-GSP adopts a candidate generate-and-test approach, which is recognized as a costly and time-consuming method in pattern discovery. To mine sequential patterns with MMSs efficiently, this study first proposes a compact data structure, called a Preorder Linked Multiple Supports tree (PLMS-tree), to store and compress the entire sequence database. Based on the PLMS-tree, we develop an efficient algorithm, Multiple Supports - Conditional Pattern growth (MSCP-growth), to discover the complete set of patterns. The experimental results show that the proposed approach yields better results than MS-GSP and conventional SPM.
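To make the rare item problem and per-item MIS values concrete, here is a minimal Python sketch. It is not MS-GSP or MSCP-growth, and it builds no PLMS-tree. It rests on assumptions the abstract does not state: a pattern's threshold is taken to be the lowest MIS among its items, which is how the multiple-minimum-supports model is commonly formulated, and each sequence is simplified to a list of single items rather than itemsets. The database, item names, MIS values, and function names are all hypothetical.

```python
from typing import Dict, List, Sequence


def contains(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` as an order-preserving subsequence."""
    it = iter(sequence)
    # `item in it` advances the iterator until it finds `item`,
    # so later pattern items are only matched after earlier ones.
    return all(item in it for item in pattern)


def support(db: List[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of sequences in `db` that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in db) / len(db)


def is_frequent(db: List[Sequence[str]], pattern: Sequence[str],
                mis: Dict[str, float]) -> bool:
    """Assumed rule: the pattern's threshold is the lowest MIS among its items."""
    threshold = min(mis[item] for item in pattern)
    return support(db, pattern) >= threshold


# Hypothetical sequence database: "bread" and "milk" are common, "oven" and "pan" are rare.
db = [
    ["bread", "milk", "bread"],
    ["bread", "milk"],
    ["milk", "bread"],
    ["bread", "oven", "pan"],
    ["milk", "bread", "milk"],
]

# Hypothetical per-item MIS values: lower thresholds for naturally rare items.
mis = {"bread": 0.5, "milk": 0.5, "oven": 0.15, "pan": 0.15}

# A single uniform minsup of 0.5, for comparison.
uniform = {item: 0.5 for item in mis}

rare_pattern = ["oven", "pan"]
print(is_frequent(db, rare_pattern, uniform))  # rare pattern judged against one high minsup
print(is_frequent(db, rare_pattern, mis))      # same pattern judged against per-item MIS
```

In this toy data, the pattern of rare items fails the uniform threshold but passes once each item carries its own MIS, while the thresholds for the common items stay high.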
