Conference: Annual ACM Symposium on Applied Computing (SAC 2010)

A persistent HY-Tree to efficiently support itemset mining on large datasets

Abstract

This paper presents the HY-Tree, a persistent tree structure that provides a compact representation of a transactional dataset for frequent itemset mining. The HY-Tree is characterized by a hybrid structure that easily adapts to different data distributions. The data representation is complete, since no support threshold is enforced during the HY-Tree creation process. The HY-Tree can be profitably exploited by a variety of itemset mining algorithms (e.g., LCM v.2, nonordFP). It effectively supports the data retrieval step in the itemset mining process by reducing both the I/O cost and the memory requirements for data loading. Experiments on large synthetic datasets show the compactness of the HY-Tree data representation, as well as the efficiency and scalability of the mining algorithms it supports.
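
The abstract does not describe the internal layout of the HY-Tree, so the following is only a minimal in-memory sketch of the general idea it builds on: a prefix tree over transactions, built without any support threshold, so that the support of any itemset can still be recovered later. It is not the HY-Tree itself; the names `Node`, `build_prefix_tree`, and `support` are hypothetical, and the hybrid and persistent (on-disk) aspects of the paper are not modeled.

```python
class Node:
    """One prefix-tree node: an item plus the number of transactions passing through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions):
    """Insert every transaction in full; no support threshold prunes any item."""
    root = Node()
    for transaction in transactions:
        root.count += 1  # root counts all transactions, including empty ones
        node = root
        # Sorting gives every transaction a canonical path, so shared prefixes merge.
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def support(root, itemset):
    """Count the transactions that contain every item of `itemset`."""
    target = sorted(set(itemset))
    if not target:
        return root.count

    def walk(node, idx):
        total = 0
        for child in node.children.values():
            if child.item == target[idx]:
                if idx + 1 == len(target):
                    total += child.count  # all target items matched on this path
                else:
                    total += walk(child, idx + 1)
            elif child.item < target[idx]:
                total += walk(child, idx)  # target item may still appear deeper
            # child.item > target[idx]: paths are sorted, so the item cannot appear below
        return total

    return walk(root, 0)


# Hypothetical toy dataset, used only to exercise the sketch.
transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
tree = build_prefix_tree(transactions)
print(support(tree, ["a", "c"]))
```

Because nothing is pruned at construction time, the same tree can answer queries for any minimum support chosen afterwards, which is the sense in which the abstract calls the representation complete.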