Journal: Knowledge and Information Systems

Multipass Algorithms for Mining Association Rules in Text Databases



Abstract

In this paper, we propose two new algorithms for mining association rules between words in text databases. The characteristics of text databases differ considerably from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the large number of itemsets (i.e., words) that need to be counted. Two well-known mining algorithms, the Apriori algorithm and the Direct Hashing and Pruning (DHP) algorithm, are evaluated in the context of mining text databases and compared with the proposed algorithms, Multipass-Apriori (M-Apriori) and Multipass-DHP (M-DHP). The proposed algorithms are shown to perform better on large text databases.
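
For orientation, the baseline Apriori algorithm named in the abstract can be sketched roughly as follows. This is a minimal illustration of classic level-wise Apriori applied to words, not the paper's M-Apriori or M-DHP. The function name `apriori`, the whitespace tokenization, and the use of an absolute document count for `min_support` are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def apriori(documents, min_support):
    """Return {frozenset(words): count} for every word set that occurs in
    at least min_support documents (an absolute count, assumed here)."""
    # Treat each document as a transaction: the set of distinct words in it.
    transactions = [frozenset(doc.lower().split()) for doc in documents]

    # Pass 1: count individual words and keep the frequent ones.
    counts = Counter(word for t in transactions for word in t)
    frequent = {frozenset([w]): c for w, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in frequent for s in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Pass k: one scan over all transactions to count the candidates.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    docs = [
        "data mining text",
        "text mining rules",
        "data mining rules",
        "text data mining",
    ]
    for itemset, count in apriori(docs, min_support=3).items():
        print(sorted(itemset), count)
```

Even in this toy form, the first pass must keep a counter for every distinct word, and the second pass a counter for every surviving word pair. This is the counting burden that the abstract identifies as the obstacle for text databases, where the vocabulary is far larger than a typical retail item catalogue.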
