Conference: International Conference on Tools with Artificial Intelligence

Mining association rules in text databases using multipass with inverted hashing and pruning


Abstract

In this paper, we propose a new algorithm named Multipass with Inverted Hashing and Pruning (MIHP) for mining association rules between words in text databases. The characteristics of text databases are quite different from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the large number of itemsets (i.e., words) that need to be counted. Two well-known mining algorithms, the Apriori algorithm [1] and the Direct Hashing and Pruning (DHP) algorithm [8], are evaluated in the context of mining text databases, and are compared with the proposed MIHP algorithm. It has been shown that the MIHP algorithm has better performance for large text databases.
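For readers unfamiliar with the baseline, the following is a minimal sketch of Apriori-style mining of word associations, treating each document as a transaction. It is not the MIHP algorithm, whose details the abstract does not give. The function name `pair_rules`, the whitespace tokenization, and the restriction to word pairs are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def pair_rules(documents, min_support, min_confidence):
    """Hypothetical helper: Apriori passes 1 and 2, then rules x -> y."""
    # Assumption: a document is one transaction, i.e. its set of distinct words.
    transactions = [set(doc.lower().split()) for doc in documents]

    # Pass 1: count single words and keep those meeting the support count.
    word_counts = Counter(w for t in transactions for w in t)
    frequent_words = {w for w, c in word_counts.items() if c >= min_support}

    # Pass 2: count only pairs made of frequent words (the Apriori pruning step).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(t & frequent_words)
        pair_counts.update(combinations(kept, 2))

    # Derive rules x -> y with confidence = support(x, y) / support(x).
    rules = []
    for (a, b), count in pair_counts.items():
        if count < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / word_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, count, confidence))
    return rules


if __name__ == "__main__":
    docs = [
        "data mining text",
        "text mining rules",
        "data mining rules",
        "text database",
    ]
    for x, y, count, confidence in pair_rules(docs, 2, 0.9):
        print(f"{x} -> {y} (support={count}, confidence={confidence:.2f})")
```

Even in this toy form, the number of candidate pairs grows quickly with vocabulary size, which is the counting cost the abstract identifies as the obstacle for text databases.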
