System and method for sequence-based subspace pattern clustering

Abstract

Unlike traditional clustering methods, which group objects that have similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rise and fall in subspaces. Pattern-based clustering extends the concept of traditional clustering and benefits a wide range of applications, including e-commerce target marketing, bioinformatics (large-scale scientific data analysis), and automatic computing (web usage analysis). However, state-of-the-art pattern-based clustering methods (e.g., the pCluster algorithm) can only handle datasets of thousands of records, which makes them inappropriate for many real-life applications. Furthermore, besides their huge volume, many datasets are also characterized by their sequentiality; for instance, customer purchase records and network event logs are usually modeled as data sequences. Hence, it becomes important to enable pattern-based clustering methods i) to handle large datasets, and ii) to discover pattern similarity embedded in data sequences. There is presented herein a novel method that offers this capability.
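
To make "a coherent pattern of rise and fall in subspaces" concrete, the following is a minimal sketch of the pairwise coherence test commonly associated with the pCluster algorithm mentioned above. It is an illustration only and not the method of this patent. It assumes the usual pScore definition for a 2x2 submatrix, |(x_a - x_b) - (y_a - y_b)|, and treats a set of objects as coherent on a set of dimensions when every such score is at most a threshold delta. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by objects x, y on dimensions a, b."""
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(objects, dims, delta):
    """Return True if every pair of objects, on every pair of the given
    dimensions, has a pScore of at most delta (brute-force check)."""
    return all(
        p_score(x, y, a, b) <= delta
        for x, y in combinations(objects, 2)
        for a, b in combinations(dims, 2)
    )


# Hypothetical data: on dimensions 0..2 the three objects rise and fall
# together (they differ only by a constant shift); dimension 3 breaks it.
o1 = [1.0, 5.0, 3.0, 9.0]
o2 = [11.0, 15.0, 13.0, 2.0]
o3 = [4.0, 8.0, 6.0, 7.0]

coherent_in_subspace = is_delta_pcluster([o1, o2, o3], dims=[0, 1, 2], delta=0.0)
coherent_in_full_space = is_delta_pcluster([o1, o2, o3], dims=[0, 1, 2, 3], delta=0.0)
print(coherent_in_subspace, coherent_in_full_space)
```

The three objects have very different values, so a distance-based method would not group them, yet they match exactly on the subspace of dimensions 0 to 2. The brute-force check also hints at the scalability problem the abstract raises: its cost grows with the number of object pairs times the number of dimension pairs.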
