...
首页> 外文期刊>Knowledge and information systems >Canonical forms for labelled trees and their applications in frequent subtree mining
【24h】

Canonical forms for labelled trees and their applications in frequent subtree mining

机译:标记树的规范形式及其在频繁子树挖掘中的应用

获取原文
获取原文并翻译 | 示例
           

摘要

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we first present two canonical forms for labelled rooted unordered trees-the breadth-first canonical form (BFCF) and the depth-first canonical form (DFCF). Then the canonical forms are applied to the frequent subtree mining problem. Based on the BFCF, we develop a vertical mining algorithm, RootedTreeMiner, to discover all frequently occurring subtrees in a database of labelled rooted unordered trees. The RootedTreeMiner algorithm uses an enumeration tree to enumerate all (frequent) labelled rooted unordered subtrees. Next, we extend the definition of the DFCF to labelled free trees and present an Apriori-like algorithm, FreeTreeMiner, to discover all frequently occurring subtrees in a database of labelled free trees. Finally, we study the performance and the scalability of our algorithms through extensive experiments based on both synthetic data and datasets from real applications.
机译:树结构广泛用于计算生物学,模式识别,XML数据库,计算机网络等领域。在本文中,我们首先为标记的有根无序树提供两种规范形式:广度优先规范形式(BFCF)和深度优先规范形式(DFCF)。然后将规范形式应用于频繁子树挖掘问题。基于BFCF,我们开发了一种垂直挖掘算法RootedTreeMiner,以发现带标签的无根树的数据库中所有频繁出现的子树。 RootedTreeMiner算法使用枚举树来枚举所有(频繁的)带标签的根无序子树。接下来,我们将DFCF的定义扩展到带标签的自由树,并提出一种类似于Apriori的算法FreeTreeMiner,以发现带标签的自由树数据库中所有频繁出现的子树。最后,我们通过基于合成数据和来自实际应用程序的数据集的广泛实验,研究了算法的性能和可伸缩性。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号