Journal: International Journal of Intelligent Information and Database Systems

Structures of frequent itemsets and classifying structures of association rule set by order relations

Abstract

This paper provides a mathematical foundation for most of the important features of the problem of discovering knowledge through association rules. The sets of itemsets and association rules are partitioned into disjoint classes by two appropriate equivalence relations based on closures. The structure and unique representation of frequent itemsets are characterized through their generators and the corresponding eliminable itemsets. Owing to this structure, each equivalence class of rules is split into a set of basic rules and a set of consequence rules according to an order relation. The basic set comprises the minimal elements (basic rules), whose forms are shown explicitly. We then propose operators that deduce all consequence rules without repetition by adding, deleting or moving appropriate eliminable itemsets on both sides of basic rules. Further, we show that mining association rules based on a new order relation, the min relation, is better than mining based on four other relations in terms of reductions in the time to extract basic rules, in their cardinalities and in rule lengths. These theoretical results are proven to be reliable. An experimental study on many benchmark databases shows the efficiency of the corresponding algorithms. Our approach (e.g., the partitions of the frequent itemset and association rule sets) is suitable for dealing with big data because it can be exploited in parallel and distributed environments.
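
As a rough illustration of the closure-based partition of frequent itemsets mentioned in the abstract, the sketch below uses the standard closure operator (the intersection of all transactions that contain an itemset) and assumes that two itemsets are equivalent when they share the same closure. The toy database, the support threshold and every function name are assumptions made for illustration and do not come from the paper. Enumeration is brute force, not the paper's algorithms, and eliminable itemsets and rule classes are not modeled.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy database; each transaction is a set of single-letter items.
TRANSACTIONS = [
    frozenset("acd"),
    frozenset("bce"),
    frozenset("abce"),
    frozenset("be"),
    frozenset("abce"),
]
MIN_SUPPORT = 2  # assumed absolute support threshold


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`.

    Only called on itemsets with support >= 1, so `covering` is non-empty.
    """
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all non-empty frequent itemsets."""
    items = sorted(set().union(*transactions))
    result = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if support(candidate, transactions) >= min_support:
                result.append(candidate)
    return result


def partition_by_closure(itemsets, transactions):
    """Group itemsets into disjoint classes keyed by their common closure."""
    classes = defaultdict(list)
    for itemset in itemsets:
        classes[closure(itemset, transactions)].append(itemset)
    return dict(classes)


def generators(equivalence_class):
    """Minimal members of a class with respect to set inclusion."""
    return [x for x in equivalence_class
            if not any(y < x for y in equivalence_class)]


frequent = frequent_itemsets(TRANSACTIONS, MIN_SUPPORT)
classes = partition_by_closure(frequent, TRANSACTIONS)

for closed, members in sorted(classes.items(),
                              key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    gens = sorted("".join(sorted(g)) for g in generators(members))
    print("closure", "".join(sorted(closed)),
          "| class size", len(members), "| generators", gens)
```

Because every frequent itemset has exactly one closure, the classes are disjoint and can be processed independently, which is consistent with the abstract's remark about parallel and distributed environments.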