Mining Frequent Sequences in One Database Scan Using Distributed Computers.

Abstract

Existing frequent-sequence mining algorithms perform multiple scans of a database or of a structure that captures the database. In this M.Sc. thesis, I propose a frequent-sequence mining algorithm that mines each database row as it reads it, so that it can potentially complete mining in the time it takes to read the database once. I achieve this by having my algorithm enumerate all sub-sequences of each row as it reads that row.

Since sub-sequence enumeration is a time-consuming process, I create a method to distribute the work over multiple computers, processors, and thread units, while balancing the load across all resources and limiting the amount of communication, so that my algorithm scales well with the number of computers used. Experimental results show that my algorithm is effective and can potentially complete the mining process in close to the time it takes to perform one scan of the input database.
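The single-scan idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: it assumes each row is a simple list of items, caps sub-sequence length with a hypothetical `max_len` parameter to keep enumeration tractable, and stands in for distribution by counting two partitions separately and merging the counts. The function names (`subsequences`, `count_rows`, `frequent`) are illustrative only.

```python
from collections import Counter
from itertools import combinations


def subsequences(row, max_len):
    """Distinct non-empty sub-sequences of `row` with at most `max_len` items."""
    found = set()
    for k in range(1, min(max_len, len(row)) + 1):
        # combinations() yields index tuples in increasing order,
        # so the relative order of items in the row is preserved.
        for idx in combinations(range(len(row)), k):
            found.add(tuple(row[i] for i in idx))
    return found


def count_rows(rows, max_len=3):
    """Read each row exactly once and count the rows containing each sub-sequence."""
    counts = Counter()
    for row in rows:
        # A set is passed in, so each sub-sequence adds at most 1 per row.
        counts.update(subsequences(row, max_len))
    return counts


def frequent(counts, min_support):
    """Keep only sub-sequences that occur in at least `min_support` rows."""
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Toy database (assumed data, not from the thesis).
rows = [["a", "b", "c"], ["a", "c"], ["b", "c"]]

# Each partition could be counted by a separate worker; only the
# per-partition counts need to be exchanged and summed.
total = count_rows(rows[:2]) + count_rows(rows[2:])
result = frequent(total, min_support=2)
```

Because every row is processed independently, the per-partition counts can simply be added together, which is one reason this style of mining lends itself to distribution. The number of sub-sequences grows exponentially with row length, which is consistent with the abstract's remark that enumeration is time-consuming.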

Bibliographic record

Author: Brajczuk, Dale Allan
Author affiliation: University of Manitoba (Canada)
Degree-granting institution: University of Manitoba (Canada)
Subject: Computer science
Degree: M.Sc.
Year: 2011
Pages: 168
Original format: PDF
Language: English

Similar literature

1. Mining weighted frequent sequences in uncertain databases [J]. Rahman Md Mahmudur, Ahmed Chowdhury Farhan, Leung Carson Kai-Sang. Information Sciences: An International Journal, 2019.
2. A novel single scan distributed pattern mining algorithm for frequent pattern identification [J]. T. Sheik Yousuf, M. Indra Devi. International Journal of Data Analysis Techniques and Strategies, 2019, No. 1.
3. Mining Sequential Patterns More Efficiently by Reducing the Cost of Scanning Sequence Databases [J]. Jiahong Wang, Yoshiaki Asanuma, Eiichiro Kodama. 情報処理学会論文誌, 2006, No. 12.
4. Mining Frequent Closed Itemsets with One Database Scanning [C]. Yong Qiu, Yong-Jie Lan. Proceedings of the 2006 International Conference on Machine Learning and Cybernetics, 2006.
5. A model for mining distributed frequent sequences [D]. Soliman, Maha Mohamed. 2004.
6. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases [O]. Md. Rezaul Karim, Md. Mamunur Rashid, Byeong-Soo Jeong. 2012.
7. Palpatine: Mining Frequent Sequences for Data Prefetching in NoSQL Distributed Key-Value Stores [O]. Sergio Esteves, Joao Nuno Silva, Luis Veiga. 2020.
