Home > U.S. National Institutes of Health literature > PLoS Clinical Trials > Mining of high utility-probability sequential patterns from uncertain databases

Mining of high utility-probability sequential patterns from uncertain databases

Abstract

High-utility sequential pattern mining (HUSPM) has become an important topic in data mining. Several HUSPM algorithms have been designed to mine high-utility sequential patterns (HUSPs), and they have been applied in real-life settings such as consumer behavior analysis and event detection in sensor networks. Most studies on HUSPM, however, focus on mining HUSPs from precise data. In real life, uncertainty is an important factor because data is collected by various types of sensors that are more or less accurate. Hence, data collected in a real-life database can be annotated with existing probabilities. This paper presents a novel pattern mining framework called high utility-probability sequential pattern mining (HUPSPM) for mining high utility-probability sequential patterns (HUPSPs) in uncertain sequence databases. A baseline algorithm with three optional pruning strategies is presented to mine HUPSPs. Moreover, to speed up the mining process, a projection mechanism is designed to create a database projection for each processed sequence, which is smaller than the original database. This greatly reduces both the number of unpromising candidates and the execution time for mining HUPSPs. Substantial experiments on both real-life and synthetic datasets show that the designed algorithm performs well in terms of runtime, number of candidates, memory usage, and scalability for different minimum utility and minimum probability thresholds.
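
To make the two thresholds concrete, the following Python sketch checks whether a single candidate pattern qualifies as an HUPSP on a toy database. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give formal definitions, so the sketch assumes that each sequence carries one existence probability, that a pattern's utility in a sequence is the utility of its best occurrence, and that a pattern's utility and probability are summed over the sequences containing it. The names `max_occurrence_utility`, `evaluate`, `is_hupsp`, and the toy data are hypothetical, and the pruning strategies and projection mechanism are not modeled.

```python
from typing import Dict, List, Optional, Tuple

# ASSUMED data model (not taken from the paper): an uncertain sequence is an
# ordered list of itemsets (item -> utility) plus one existence probability.
UncertainSequence = Tuple[List[Dict[str, float]], float]


def max_occurrence_utility(
    itemsets: List[Dict[str, float]], pattern: List[str]
) -> Optional[float]:
    """Highest utility of `pattern` (a list of single items) as an ordered
    subsequence of `itemsets`, or None if the pattern does not occur."""
    neg_inf = float("-inf")
    # best[k] = highest utility found so far for the first k pattern items.
    best = [0.0] + [neg_inf] * len(pattern)
    for itemset in itemsets:
        # Iterate right to left so one itemset matches at most one position.
        for k in range(len(pattern), 0, -1):
            item = pattern[k - 1]
            if item in itemset and best[k - 1] != neg_inf:
                best[k] = max(best[k], best[k - 1] + itemset[item])
    return None if best[-1] == neg_inf else best[-1]


def evaluate(db: List[UncertainSequence], pattern: List[str]) -> Tuple[float, float]:
    """Sum utility and probability over the sequences containing `pattern`
    (an assumed aggregation rule)."""
    utility, probability = 0.0, 0.0
    for itemsets, prob in db:
        u = max_occurrence_utility(itemsets, pattern)
        if u is not None:
            utility += u
            probability += prob
    return utility, probability


def is_hupsp(
    db: List[UncertainSequence], pattern: List[str], min_util: float, min_prob: float
) -> bool:
    """A pattern qualifies only if it clears BOTH thresholds."""
    utility, probability = evaluate(db, pattern)
    return utility >= min_util and probability >= min_prob


# Toy database with made-up utilities and probabilities.
db: List[UncertainSequence] = [
    ([{"a": 2, "b": 1}, {"c": 4}, {"a": 3}], 0.9),
    ([{"a": 1}, {"b": 5}, {"c": 2}], 0.6),
    ([{"c": 3}, {"a": 2}], 0.4),
]

if __name__ == "__main__":
    print(evaluate(db, ["a", "c"]))
    print(is_hupsp(db, ["a", "c"], min_util=8, min_prob=1.0))
```

In this toy database the pattern `a` followed by `c` occurs in the first two sequences only, so the third sequence contributes neither utility nor probability. Raising either threshold above the accumulated value rejects the pattern, which is the reason a miner in this setting has to track both quantities for every candidate.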