Journal: IEEE Transactions on Knowledge and Data Engineering

On the Efficient Representation of Datasets as Graphs to Mine Maximal Frequent Itemsets

Abstract

Frequent itemset mining is an active research problem in data mining and knowledge discovery. With advances in database technology and an exponential increase in the volume of stored data, there is a need for efficient approaches that can quickly extract useful information from large datasets. Frequent itemset (FI) mining is the task of finding itemsets in a transactional database whose items occur together with a frequency above a given threshold. Finding these FIs usually requires multiple passes over the database, so efficient algorithms are crucial. This work presents a graph-based approach for representing a complete transactional database. The proposed representation stores all of the information needed to extract FIs in a single pass over the database. An algorithm that extracts the FIs from the graph-based structure is then presented. Experimental results compare the proposed approach with 17 related FI mining methods on six benchmark datasets and show that the proposed approach performs better than the others in terms of time.
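
To make the terms in the abstract concrete, the following minimal Python sketch computes frequent itemsets and the maximal ones among them on a toy database. It is a brute-force illustration of the definitions only, not the graph-based method of the paper, whose structure the abstract does not describe. The transactions, the `min_support` value, the "at least `min_support` transactions" threshold convention, and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
min_support = 3  # assumed absolute threshold (number of transactions)


def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_sup):
    """Brute-force enumeration; exponential in the number of distinct items."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(candidate, db)
            if s >= min_sup:
                result[candidate] = s
    return result


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [x for x in frequent if not any(x < y for y in frequent)]


fis = frequent_itemsets(transactions, min_support)
mfis = maximal_itemsets(fis)
```

Note that `support` rescans the whole database for every candidate itemset. This repeated scanning is the cost that a one-pass representation, such as the one the abstract proposes, is meant to avoid.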