Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

A Bounded and Adaptive Memory-Based Approach to Mine Frequent Patterns From Very Large Databases


Abstract

Most existing methods for association rules mining (ARM) rely on special data structures to project the database, either totally or partially, into primary memory. Traditionally, these data structures reside in main memory and rely on the existing paging mechanism of the virtual memory manager (VMM) to handle storage when they outgrow primary memory. Typically, the VMM moves the overflowing data to secondary memory based on some preassumed memory usage criteria. However, this direct and unplanned use of virtual memory results in unpredictable behavior or thrashing, as reported in some of the works described in the literature. This paper tackles the problem by presenting an ARM model capable of mining a transactional database regardless of its size and without relying on the underlying VMM. The proposed approach can operate within a bounded portion of primary memory, which leaves the rest of main memory available for other tasks with different priorities. In other words, we propose a specialized memory management system that caters to the needs of the ARM model: the proposed data structure is first constructed in the allocated primary memory, and if at any point it grows beyond the allocated memory quota, part of it is forced out to secondary memory. The secondary-memory version of the structure is accessed on a block-by-block basis so that both the spatial and temporal localities of the I/O access are optimized. Thus, the proposed framework takes control of virtual memory access and manages the required virtual memory in the way that best serves the mining process. Several clever data structures are used to facilitate these optimizations. Our method has the additional advantage that other tasks of different priorities may run concurrently with the main mining task with as little interference as possible, because we do not rely on the default paging mechanism of the VMM. The reported test results demonstrate the applicability and effectiveness of the proposed approach.
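The general idea of a memory quota with explicit, block-wise spilling can be illustrated with a small Python sketch. This is not the paper's data structure or algorithm, which the abstract does not specify. It is a simplified stand-in that only counts single items, and the class name `BoundedItemCounter`, the `max_entries` quota, and the pickle-based block files are all assumptions made for illustration. When the in-memory table exceeds the quota, it is written to disk as one sorted block; the final pass streams all blocks sequentially and merges them, so no block is ever loaded whole.

```python
import heapq
import os
import pickle
import tempfile
from collections import Counter
from itertools import groupby
from operator import itemgetter


class BoundedItemCounter:
    """Hypothetical sketch: item counting under an explicit in-memory quota."""

    def __init__(self, max_entries, spill_dir):
        self.max_entries = max_entries  # assumed quota, in table entries
        self.spill_dir = spill_dir
        self.counts = Counter()
        self.block_paths = []

    def add_transaction(self, items):
        # Count each distinct item once per transaction (support counting).
        self.counts.update(set(items))
        if len(self.counts) > self.max_entries:
            self._spill()

    def _spill(self):
        # Write the in-memory table as one sorted block, then free the memory.
        path = os.path.join(self.spill_dir, f"block_{len(self.block_paths)}.pkl")
        with open(path, "wb") as f:
            for pair in sorted(self.counts.items()):
                pickle.dump(pair, f)
        self.block_paths.append(path)
        self.counts.clear()

    @staticmethod
    def _read_block(path):
        # Stream (item, count) pairs from a block file one record at a time.
        with open(path, "rb") as f:
            while True:
                try:
                    yield pickle.load(f)
                except EOFError:
                    return

    def frequent_items(self, min_support):
        # Merge all sorted blocks plus the in-memory remainder sequentially.
        streams = [self._read_block(p) for p in self.block_paths]
        streams.append(iter(sorted(self.counts.items())))
        merged = heapq.merge(*streams, key=itemgetter(0))
        for item, group in groupby(merged, key=itemgetter(0)):
            total = sum(count for _, count in group)
            if total >= min_support:
                yield item, total


transactions = [["a", "b"], ["b", "c"], ["a", "b", "d"], ["b", "e"]]
with tempfile.TemporaryDirectory() as spill_dir:
    counter = BoundedItemCounter(max_entries=2, spill_dir=spill_dir)
    for transaction in transactions:
        counter.add_transaction(transaction)
    # Consume the generator while the block files still exist.
    frequent = dict(counter.frequent_items(min_support=2))
    spilled_blocks = len(counter.block_paths)
```

With the toy quota of two entries, the table is spilled twice, and the merge pass reports items "a" and "b" as frequent at a minimum support of 2. The point of the design is that the program itself, not the operating system's pager, decides when data leaves memory and reads it back in sequential order.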
