...
Journal: Information and Software Technology

Fast mining of frequent tree structures by hashing and indexing


Abstract

Hierarchical semistructured data arise frequently on the Web and in biological information processing applications. Semistructured objects describing the same type of information have similar but not identical structures, and they usually share a common 'schema'. Finding the common schema of a collection of semistructured objects is an important task, and because of the huge volume of such data, data mining techniques have been employed. In this paper, we study the problem of discovering frequently occurring structures in semistructured objects using the notion of association rules. We identify that discovering the frequent structures in the early phases of the mining procedure is the dominant cost, and we provide a fast algorithm addressing this issue. We present experimental results that demonstrate the superiority of the proposed algorithm and its efficiency in dramatically reducing the processing cost.
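To make the problem statement concrete, the following is a minimal sketch of support counting over semistructured objects. It is not the algorithm proposed in the paper, which the abstract does not describe. It assumes that objects are nested dictionaries, that a "structure" is simplified to a root-anchored label path, and that support is the number of objects containing that path. The names `label_paths` and `frequent_paths` are hypothetical, and the hash table is Python's built-in `Counter`.

```python
from collections import Counter


def label_paths(obj, prefix=()):
    """Yield every root-anchored label path found in a nested dict."""
    if isinstance(obj, dict):
        for label, child in obj.items():
            path = prefix + (label,)
            yield path
            yield from label_paths(child, path)


def frequent_paths(objects, min_support):
    """Return the paths that occur in at least min_support objects."""
    support = Counter()  # hash table: path -> number of objects containing it
    for obj in objects:
        # set() ensures each object contributes at most once per path
        support.update(set(label_paths(obj)))
    return {path: n for path, n in support.items() if n >= min_support}


# Three objects with similar but not identical structure (illustrative data).
objects = [
    {"person": {"name": "A", "email": "a@x"}},
    {"person": {"name": "B", "phone": "1"}},
    {"person": {"name": "C", "email": "c@x", "phone": "2"}},
]

result = frequent_paths(objects, min_support=3)
```

With `min_support=3`, only the paths shared by all three objects remain, which serves as a crude stand-in for their common schema. Lowering the threshold to 2 would also admit the `email` and `phone` paths.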
