Annals of Mathematics and Artificial Intelligence

A fast compound algorithm for mining generators, closed itemsets, and computing links between equivalence classes



Abstract

In pattern mining and association rule mining, there is a variety of algorithms for mining frequent closed itemsets (FCIs) and frequent generators (FGs), whereas only a smaller number also address the precedence relation between FCIs. The interplay of these three constructs and their joint computation have been studied within the formal concept analysis (FCA) field, yet none of the proposed algorithms is scalable. In frequent pattern mining, at least one suite of efficient algorithms has been designed that exploits essentially the same ideas and follows the same overall computational schema. Based on an in-depth analysis of the aforementioned interplay, which is rooted in a fundamental duality from hypergraph theory, we propose a new schema that should enable a more parsimonious computation. We exemplify the new schema in the design of Snow-Touch, a concrete FCI/FG/precedence miner that reuses an existing algorithm, Charm, for mining FCIs, and completes it with two original methods for mining FGs and precedence, respectively. Snow-Touch and its closest competitor, Charm-L, were experimentally compared on a large variety of datasets. The outcome of the experimental study suggests that our method outperforms Charm-L on dense data, while on sparse data the trend is reversed. Furthermore, we demonstrate the usefulness of our method and the new schema through an application to the analysis of a genome dataset. The initial results reported here confirm the capacity of the method to focus on significant associations.
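To make the three constructs named in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not Snow-Touch, Charm, or Charm-L, and it makes no attempt at scalability; it only applies the standard definitions (a closed itemset has no proper superset with the same support, a generator has no proper subset with the same support, and precedence links a closed itemset to its immediate closed supersets). The toy transactions, the support threshold, and all function names are assumptions introduced for illustration.

```python
from itertools import combinations

# Hypothetical toy transaction database, chosen only for illustration.
TRANSACTIONS = [
    {"a", "b", "d", "e"},
    {"a", "c"},
    {"a", "b", "c", "e"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
]
MIN_SUPPORT = 2  # absolute support threshold (assumption)


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`,
    i.e. its largest superset with the same support."""
    covering = [t for t in transactions if itemset <= t]
    return frozenset(set.intersection(*covering))


def mine(transactions, min_support):
    """Map each frequent closed itemset to the list of its generators."""
    items = sorted(set().union(*transactions))
    classes = {}
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            s = support(x, transactions)
            if s < min_support:
                continue
            # Support is anti-monotone, so checking the subsets that drop
            # one item is enough to decide whether x is a generator.
            if any(support(x - {i}, transactions) == s for i in x):
                continue
            classes.setdefault(closure(x, transactions), []).append(x)
    return classes


def precedence(closed_sets):
    """Pairs (c1, c2) where c2 is an immediate closed superset of c1."""
    return [
        (c1, c2)
        for c1 in closed_sets
        for c2 in closed_sets
        if c1 < c2 and not any(c1 < m < c2 for m in closed_sets)
    ]


classes = mine(TRANSACTIONS, MIN_SUPPORT)
links = precedence(list(classes))

for closed, generators in classes.items():
    print(sorted(closed), "<-", [sorted(g) for g in generators])
```

On this toy data, the itemsets {b} and {e} both occur in the same four transactions, so both are generators of the single closed itemset {b, e}. Each key of `classes` stands for one equivalence class, and `links` holds the precedence pairs between them. The sketch enumerates every candidate itemset, which is exactly the cost that dedicated miners are designed to avoid.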
