Mining Frequent Rooted Ordered Tree Generators Efficiently

Abstract

With the wide application of tree-structured data, such as XML databases, research on mining frequent subtree patterns has recently attracted much attention in the data mining and database communities. Due to the downward closure property, mining the complete set of frequent subtree patterns can lead to an exponential number of results. Although existing studies have proposed several ways to alleviate this problem (i.e., mining frequent closed subtree patterns or maximal subtree patterns) by compressing large result sets, these solutions are not suitable for some real applications, such as frequent pattern-based classification. Furthermore, according to the Minimum Description Length (MDL) principle, frequent rooted subtree generators are preferable to frequent closed/maximal subtree patterns in frequent pattern-based classification. In this paper, we study a novel problem: mining frequent rooted ordered tree generators. To speed up the mining process, we propose a depth-first-search-based framework. Moreover, two effective pruning strategies are integrated into the framework to reduce the search space and avoid redundant computation. Finally, we verify the effectiveness and efficiency of our proposed approaches through extensive experiments.
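
The abstract does not define a generator formally. In the frequent-pattern literature, a generator is usually a frequent pattern that has no proper sub-pattern with the same support. The following Python fragment is only a minimal sketch of that notion for rooted ordered trees, not the paper's algorithm. It rests on several assumptions that do not come from the abstract: trees are nested `(label, children)` tuples, containment means induced ordered subtree inclusion, support counts the database trees that contain the pattern, and the empty pattern is ignored. All names (`matches_at`, `contains`, `support`, `immediate_subpatterns`, `is_generator`, `db`) are hypothetical.

```python
# A tree is (label, children), where children is a tuple of trees in left-to-right order.

def matches_at(pattern, node):
    """True if pattern occurs rooted at node, preserving labels and child order."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    i = 0
    # Greedily match pattern children to an ordered subsequence of node children.
    for pc in p_children:
        while i < len(n_children) and not matches_at(pc, n_children[i]):
            i += 1
        if i == len(n_children):
            return False
        i += 1
    return True

def contains(tree, pattern):
    """True if pattern occurs at the root of tree or at any descendant."""
    return matches_at(pattern, tree) or any(contains(c, pattern) for c in tree[1])

def support(db, pattern):
    """Number of database trees containing the pattern (assumed support definition)."""
    return sum(contains(t, pattern) for t in db)

def remove_one_leaf(tree):
    """Yield every tree obtained by deleting exactly one leaf."""
    label, children = tree
    for i, c in enumerate(children):
        if not c[1]:
            yield (label, children[:i] + children[i + 1:])
        else:
            for sub in remove_one_leaf(c):
                yield (label, children[:i] + (sub,) + children[i + 1:])

def immediate_subpatterns(tree):
    """Sub-patterns with one node fewer: drop a leaf, or drop a single-child root."""
    yield from remove_one_leaf(tree)
    if len(tree[1]) == 1:
        yield tree[1][0]

def is_generator(db, pattern, min_sup):
    """Frequent, and no immediate sub-pattern has the same support.

    Checking only immediate sub-patterns relies on support being anti-monotone.
    """
    s = support(db, pattern)
    if s < min_sup:
        return False
    return all(support(db, q) != s for q in immediate_subpatterns(pattern))

# Toy database, made up purely for illustration.
db = [
    ("a", (("b", ()), ("c", ()))),
    ("a", (("b", ()), ("c", ()), ("d", ()))),
    ("b", (("c", ()),)),
]
```

On this toy database, the single node `a` is a generator at minimum support 2, whereas `a` with child `b` is not, because it has the same support as `a` alone. A depth-first miner of the kind the abstract mentions would grow patterns node by node and apply such a check together with pruning rules, which the abstract does not specify.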
