Itemset Mining on Indexed Data Blocks

Abstract

This paper presents a novel index, called I-Forest, to support data mining activities on evolving databases, whose content is periodically updated through insertion (or deletion) of data blocks. I-Forest allows the extraction of itemsets from transactional databases, such as transactional data from large retail chains. Item, support, and time constraints may be enforced during the extraction phase. The proposed index is a covering index that represents transactional blocks in a succinct form and allows different kinds of analysis (e.g., analysis of quarterly data). During the creation phase, no support constraint is enforced. Thus, the index provides a complete representation of the evolving data. The I-Forest index has been implemented in the PostgreSQL open source DBMS and exploits its physical-level access methods. Experiments have been run for both sparse and dense data distributions. The execution time of the frequent itemset extraction task exploiting the index is always comparable with, and for low support thresholds faster than, that of the Prefix-Tree algorithm accessing static data on a flat file.
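
To make the extraction-time constraints concrete, the following is a minimal Python sketch of itemset extraction over time-labelled data blocks with item, support, and time constraints. It is not the I-Forest index: it uses brute-force counting over in-memory blocks, and the names `mine_itemsets`, `blocks`, `periods`, `min_count`, `required_items`, and `max_len` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_itemsets(blocks, periods, min_count, required_items=frozenset(), max_len=3):
    """Brute-force itemset extraction over selected data blocks.

    blocks: dict mapping a period label to a list of transactions (sets of items).
    periods: labels of the blocks to analyze (the time constraint).
    min_count: minimum absolute support (the support constraint).
    required_items: items every returned itemset must contain (the item constraint).
    max_len: cap on itemset size, to keep the enumeration small.
    """
    # Time constraint: only transactions from the requested blocks are scanned.
    selected = [t for p in periods for t in blocks.get(p, [])]

    # Count every itemset of up to max_len items in each selected transaction.
    counts = Counter()
    for transaction in selected:
        items = sorted(transaction)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1

    # Support and item constraints are applied only at extraction time.
    return {
        itemset: n
        for itemset, n in counts.items()
        if n >= min_count and required_items <= itemset
    }


# Hypothetical quarterly blocks of retail transactions.
blocks = {
    "2024-Q1": [{"bread", "milk"}, {"bread", "butter"}, {"milk"}],
    "2024-Q2": [{"bread", "milk", "butter"}, {"bread", "milk"}],
}

half_year = mine_itemsets(
    blocks, ["2024-Q1", "2024-Q2"], min_count=3, required_items=frozenset({"bread"})
)
first_quarter = mine_itemsets(blocks, ["2024-Q1"], min_count=2)
```

Because the blocks keep all transactions and no support threshold is applied when they are stored, the same data can be queried again with a different period selection or threshold, which mirrors the abstract's point that the index is built without a support constraint.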
