Conference: International Conference on Web Information Systems Engineering

FreeS: A Fast Algorithm to Discover Frequent Free Subtrees Using a Novel Canonical Form

Abstract

Web data can often be represented in free tree form; however, few free tree mining methods exist. This paper presents FreeS, a computationally fast algorithm that discovers all frequently occurring free subtrees in a database of labelled free trees. FreeS is designed using an optimal canonical form, BOCF, that can uniquely represent free trees even in the presence of isomorphism. To avoid enumerating false positive candidates, it uses an enumeration approach based on a tree-structure guided scheme. The paper presents lemmas that introduce conditions to which the generation of free tree candidates must conform during enumeration. An empirical study using both real and synthetic datasets shows that FreeS is scalable and significantly outperforms (i.e., is a few orders of magnitude faster than) the state-of-the-art frequent free tree mining algorithms, HybridTreeMiner and FreeTreeMiner.
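To illustrate what a canonical form for free trees has to achieve, the following is a minimal sketch in Python. It is not BOCF, whose definition the abstract does not give. It uses a well-known textbook alternative instead: root the tree at its centre (or try both centres when there are two) and sort the child encodings recursively. The function names `tree_centres`, `encode`, and `canonical_string` are hypothetical. The sketch assumes that vertex identifiers are mutually comparable (for example, all integers) and that labels are strings containing no parentheses.

```python
def tree_centres(adj):
    """Return the one or two centre vertices of a free tree.

    Assumption: adj maps each vertex to the set of its neighbours
    and describes a connected, acyclic graph.
    """
    if len(adj) <= 2:
        return sorted(adj)
    degree = {v: len(ns) for v, ns in adj.items()}
    leaves = [v for v, d in degree.items() if d == 1]
    remaining = len(adj)
    # Strip one layer of leaves at a time until at most two vertices remain.
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for nb in adj[leaf]:
                degree[nb] -= 1
                if degree[nb] == 1:
                    next_leaves.append(nb)
        leaves = next_leaves
    return sorted(leaves)


def encode(adj, labels, root, parent=None):
    """Encode the subtree under root; sorting children removes ordering choices."""
    children = sorted(
        encode(adj, labels, child, root)
        for child in adj[root]
        if child != parent
    )
    return "(" + labels[root] + "".join(children) + ")"


def canonical_string(edges, labels):
    """Illustrative canonical string for a labelled free tree (not BOCF)."""
    adj = {v: set() for v in labels}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # With two centres, keep the lexicographically smaller encoding.
    return min(encode(adj, labels, c) for c in tree_centres(adj))


# The same star-shaped tree under two different vertex numberings.
star_a = canonical_string([(0, 1), (1, 2), (1, 3)],
                          {0: "A", 1: "B", 2: "C", 3: "A"})
star_b = canonical_string([(5, 7), (9, 7), (7, 2)],
                          {7: "B", 5: "A", 9: "A", 2: "C"})
# A path A-B-A-C with the same multiset of labels but a different shape.
path = canonical_string([(0, 1), (1, 2), (2, 3)],
                        {0: "A", 1: "B", 2: "A", 3: "C"})

print(star_a)            # (B(A)(A)(C))
print(star_a == star_b)  # True
print(star_a == path)    # False
```

Because isomorphic trees map to the same string, a miner can detect duplicate candidates by comparing strings rather than running a pairwise isomorphism test. How FreeS exploits BOCF during enumeration is described in the paper itself, not in this sketch.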
