首页> 外文期刊>IEEE Transactions on Knowledge and Data Engineering >Mining closed and maximal frequent subtrees from databases of labeled rooted trees
【24h】

Mining closed and maximal frequent subtrees from databases of labeled rooted trees

机译:从带标签的根树数据库中挖掘封闭的最大子树

获取原文
获取原文并翻译 | 示例

摘要

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. One important problem in mining databases of trees is to find frequently occurring subtrees. Because of the combinatorial explosion, the number of frequent subtrees usually grows exponentially with the size of frequent subtrees and, therefore, mining all frequent subtrees becomes infeasible for large tree sizes. We present CMTreeMiner, a computationally efficient algorithm that discovers only closed and maximal frequent subtrees in a database of labeled rooted trees, where the rooted trees can be either ordered or unordered. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all frequent subtrees. Several techniques are proposed to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. Heuristic techniques are used to arrange the order of computation so that relatively expensive computation is avoided as much as possible. We study the performance of our algorithm through extensive experiments, using both synthetic data and data sets from real applications. The experimental results show that our algorithm is very efficient in reducing the search space and quickly discovers all closed and maximal frequent subtrees.
机译:树结构广泛用于计算生物学,模式识别,XML数据库,计算机网络等领域。挖掘树木数据库的一个重要问题是找到频繁出现的子树。由于组合爆炸,频繁子树的数量通常随频繁子树的大小呈指数增长,因此,对于大树而言,挖掘所有频繁子树变得不可行。我们提出了CMTreeMiner,这是一种计算效率高的算法,它仅在标记的有根树的数据库中发现封闭的和最大的频繁子树,其中有根树可以是有序的也可以是无序的。该算法通过遍历系统地枚举所有频繁子树的枚举树来挖掘封闭的和最大的频繁子树。提出了几种技术来修剪不对应于封闭或最大频繁子树的枚举树的分支。启发式技术用于安排计算顺序,以便尽可能避免使用相对昂贵的计算。我们使用综合数据和实际应用中的数据集,通过广泛的实验研究了算法的性能。实验结果表明,我们的算法在减少搜索空间和快速发现所有闭合和最大频繁子树方面非常有效。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号