High average-utility sequential pattern mining based on uncertain databases

Lin Jerry Chun-Wei; Li Ting; Pirouz Matin; Zhang Ji; Fournier-Viger Philippe

Abstract

The emergence and proliferation of Internet of Things (IoT) devices have resulted in the generation of large volumes of uncertain data, owing to the varied accuracy and decay of sensors and their different sensitivity ranges. Because uncertainty plays an important role in IoT data, mining useful information from uncertain datasets has become an important issue in recent decades. Past works focus on mining high-utility sequential patterns from uncertain databases. However, the utility of a derived sequence increases with the size of the sequence, which makes it an unfair measure for evaluating a sequence: any sequence formed by combining a high-utility sequence with further items is also a high-utility sequence, even if the utility of what was added is low. In this paper, we address this limitation of previous potentially high-utility sequential pattern mining and present a framework for discovering the set of potentially high average-utility sequential patterns (PHAUSPs) from uncertain datasets. By taking the size of a sequence into account, the framework provides a fairer measure of patterns than previous works. First, a baseline potentially high average-utility sequential pattern algorithm and three pruning strategies are introduced to mine the complete set of the desired PHAUSPs. To reduce the computational cost and accelerate the mining process, a projection algorithm called PHAUP is then designed, which reduces the number of candidates for the desired patterns. Experiments on both real-life and artificial datasets evaluate runtime, number of candidates, memory overhead, number of discovered patterns, and scalability, and the results show that the proposed algorithms achieve promising performance, especially the PHAUP approach.
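The length bias described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' baseline algorithm or PHAUP. The toy database, the function names (`max_match_utility`, `evaluate`), the single-item elements, and the rule that a pattern's probability is the sum of the existence probabilities of the sequences containing it are all assumptions made here for illustration. Average utility is taken as the utility of a match divided by the number of items in the pattern.

```python
from typing import List, Optional, Tuple

# Hypothetical toy model: a sequence is an ordered list of (item, utility)
# pairs plus one existence probability for the whole sequence.
QSequence = List[Tuple[str, float]]
Database = List[Tuple[QSequence, float]]


def max_match_utility(pattern: List[str], seq: QSequence) -> Optional[float]:
    """Highest total utility over all in-order matches of pattern in seq.

    Returns None when the pattern does not occur in the sequence.
    """
    neg = float("-inf")
    # best[j] = best utility of matching the first j pattern items so far.
    best = [0.0] + [neg] * len(pattern)
    for item, util in seq:
        # Walk j downwards so one sequence element fills one position only.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == item and best[j - 1] != neg:
                best[j] = max(best[j], best[j - 1] + util)
    return None if best[-1] == neg else best[-1]


def evaluate(pattern: List[str], db: Database) -> Tuple[float, float, float]:
    """Return (total utility, total average utility, summed probability)."""
    total_u = total_au = prob = 0.0
    for seq, p in db:
        u = max_match_utility(pattern, seq)
        if u is not None:
            total_u += u
            total_au += u / len(pattern)  # normalise by pattern length
            prob += p                     # assumed probability model
    return total_u, total_au, prob


# Made-up data, not taken from the paper's experiments.
db: Database = [
    ([("a", 6.0), ("b", 2.0), ("c", 1.0)], 0.9),
    ([("a", 5.0), ("c", 1.0)], 0.6),
    ([("b", 3.0), ("a", 4.0), ("c", 2.0)], 0.8),
]

for pattern in (["a"], ["a", "c"], ["a", "b", "c"]):
    u, au, p = evaluate(pattern, db)
    print(pattern, "utility:", u, "average utility:", au, "probability:", round(p, 2))
```

In this toy data, appending the low-utility item `c` to `a` raises the total utility, so a plain utility threshold favours the longer pattern, while the average utility drops because the added item contributes little per position.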

Bibliographic details

Source
Knowledge and information systems | 2020, Issue 3 | 30 pages
Authors
Lin Jerry Chun-Wei; Li Ting; Pirouz Matin; Zhang Ji; Fournier-Viger Philippe
Author affiliations

Western Norway Univ Appl Sci, Dept Comp Sci Elect Engn & Math Sci, Bergen, Norway;

Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen, Peoples R China;

Calif State Univ Fresno, Dept Comp Sci, Fresno, CA 93740 USA;

Univ Southern Queensland, Sch Agr Computat & Environm Sci, Toowoomba, Qld, Australia;

Harbin Inst Technol Shenzhen, Sch Nat Sci & Humanities, Shenzhen, Peoples R China;

Indexing information
Original format: PDF
Text language: eng
Chinese Library Classification: automatic information theory;
Keywords
High average-utility sequential pattern mining; Sequential patterns; Uncertain database; Data mining;

Similar literature

1. High average-utility sequential pattern mining based on uncertain databases [J]. Lin Jerry Chun-Wei, Li Ting, Pirouz Matin. Knowledge and information systems. 2020, Issue 3
2. Mining Probabilistically Frequent Sequential Patterns in Large Uncertain Databases [J]. Zhao Z., Yan D., Ng W. IEEE Transactions on Knowledge and Data Engineering. 2014, Issue 5
3. Mining top-k sequential patterns in transaction database graphs: A new challenging problem and a sampling-based approach [J]. Lei Mingtao, Chu Lingyang, Wang Zhefeng. World Wide Web. 2020, Issue 1
4. Mining Sequential Patterns in Uncertain Databases Using Hierarchical Index Structure [C]. Kashob Kumar Roy, Hasibul Haque Moon, Mahmudur Rahman. Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2021
5. New algorithms for frequent sequential pattern and itemset data mining in certain and uncertain databases [D]. Peterson, Erich Allen. 2012
6. Mining of high utility-probability sequential patterns from uncertain databases [O]. Binbin Zhang, Jerry Chun-Wei Lin, Philippe Fournier-Viger. 2011
7. Mining of high utility-probability sequential patterns from uncertain databases [O]. Binbin Zhang, Jerry Chun-Wei Lin, Philippe Fournier-Viger. 2017
