首页> 外文学位 >A formal concept analysis approach to association rule mining: The QuICL algorithms.
【24h】

A formal concept analysis approach to association rule mining: The QuICL algorithms.

机译:关联规则挖掘的正式概念分析方法:QuICK算法。

获取原文
获取原文并翻译 | 示例

摘要

Association rule mining (ARM) is the task of identifying meaningful implication rules exhibited in a data set. Most research has focused on extracting frequent item (FI) sets and thus fallen short of the overall ARM objective. The FI miners fail to identify the upper covers that are needed to generate a set of association rules whose size can be exploited by an end user. An alternative to FI mining can be found in formal concept analysis (FCA), a branch of applied mathematics. FCA derives a concept lattice whose concepts identify closed FI sets and connections identify the upper covers. However, most FCA algorithms construct a complete lattice and therefore include item sets that are not frequent. An iceberg lattice, on the other hand, is a concept lattice whose concepts contain only FI sets. Only three algorithms to construct an iceberg lattice were found in literature. Given that an iceberg concept lattice provides an analysis tool to succinctly identify association rules, this study investigated additional algorithms to construct an iceberg concept lattice. This report presents the development and analysis of the Quick Iceberg Concept Lattice (QuICL) algorithms. These algorithms provide incremental construction of an iceberg lattice. QuICL uses recursion instead of iteration to navigate the lattice and establish connections, thereby eliminating costly processing incurred by past algorithms. The QuICL algorithms were evaluated against leading FI miners and FCA construction algorithms using benchmarks cited in literature. Results demonstrate that QuICL provides performance on the order of FI miners yet additionally derive the upper covers. QuICL, when combined with known algorithms to extract a basis of association rules from a lattice, offer a "best known" ARM solution. Beyond this, the QuICL algorithms have proved to be very efficient, providing an order of magnitude gains over other incremental lattice construction algorithms. For example, on the Mushroom data set, QuICL completes in less than 3 seconds. Past algorithms exceed 200 seconds. On T10I4D100k, QuICL completes in less than 120 seconds. Past algorithms approach 10,000 seconds. QuICL is proved to be the "best known" all around incremental lattice construction algorithm. Runtime complexity is shown to be O(ℓ d i) where ℓ is the cardinality of the lattice, d is the average degree of the lattice, and i is a mean function on the frequent item extents.
机译:关联规则挖掘(ARM)是识别数据集中显示的有意义的蕴涵规则的任务。大多数研究都集中在提取频繁项目(FI)集上,因此未能达到ARM总体目标。 FI矿工无法识别生成关联规则集所需的上盖,这些关联规则的大小可以被最终用户利用。 FI挖掘的替代方法可以在形式概念分析(FCA)(应用数学的一个分支)中找到。 FCA派生出一个概念晶格,其概念标识封闭的FI集,而连接标识上盖。但是,大多数FCA算法会构造一个完整的晶格,因此会包含不频繁的项目集。另一方面,冰山晶格是概念晶格,其概念仅包含FI集。在文献中仅发现了三种构造冰山晶格的算法。鉴于冰山概念格提供了一种分析工具来简洁地识别关联规则,因此本研究调查了其他构建冰山概念格的算法。本报告介绍了快速冰山概念格(QuICL)算法的开发和分析。这些算法提供了冰山晶格的增量构造。 QuICL使用递归而不是迭代来导航晶格并建立连接,从而消除了过去算法带来的昂贵处理。使用文献中引用的基准,针对领先的FI矿工和FCA构建算法对QuICL算法进行了评估。结果表明,QuICL可提供与FI矿工相同的性能,并且还可以得出上层覆盖率。 QuICL与已知算法结合以从晶格中提取关联规则的基础时,可提供“最知名”的ARM解决方案。除此之外,QuICL算法已被证明非常有效,与其他增量晶格构造算法相比,可提供一个数量级的增益。例如,在“蘑菇”数据集上,QuICL不到3秒即可完成。过去的算法超过200秒。在T10I4D100k上,QuICL不到120秒即可完成。过去的算法接近10,000秒。 QuICL被证明是所有增量格网构造算法中“最知名的”。运行时复杂度显示为O(ℓ d i),其中ℓ是晶格的基数,d是晶格的平均度,i是频繁项范围上的均值函数。

著录项

  • 作者

    Smith, David T.;

  • 作者单位

    Nova Southeastern University.;

  • 授予单位 Nova Southeastern University.;
  • 学科 Computer Science.
  • 学位 Ph.D.
  • 年度 2009
  • 页码 339 p.
  • 总页数 339
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类 自动化技术、计算机技术;
  • 关键词

  • 入库时间 2022-08-17 11:37:53

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号