Tree pattern mining with tree automata constraints

Abstract

Most work on pattern mining focuses on simple data structures such as itemsets and sequences of itemsets. However, many recent applications dealing with complex data, such as chemical compounds, protein structures, XML and Web log databases, and social networks, require much more sophisticated data structures such as trees and graphs. In these contexts, interesting patterns involve not only frequent object values (labels) appearing in the graphs (or trees) but also frequent specific topologies found in these structures. Recently, several techniques for tree and graph mining have been proposed in the literature. In this paper, we focus on constraint-based tree pattern mining. We propose to use tree automata as a mechanism to specify user constraints over tree patterns. We present the algorithm CoBMiner, which allows user constraints specified by a tree automaton to be incorporated into the mining process. An extensive set of experiments executed over synthetic and real data (XML documents and Web usage logs) allows us to conclude that incorporating constraints during the mining process is far more effective than filtering the interesting patterns after the mining process.
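
To make the idea of a tree automaton as a user constraint concrete, the following is a minimal sketch of a deterministic bottom-up tree automaton that accepts or rejects a candidate tree pattern. It is not the CoBMiner algorithm, whose details the abstract does not give. The class name `TreeAutomaton`, the tuple encoding of trees, and the `book`/`title`/`author` labels are all assumptions made for illustration. The last line shows the post-mining filtering baseline that the abstract compares against.

```python
from typing import Dict, Optional, Set, Tuple

# Assumed encoding: a tree is (label, (child, child, ...)); a leaf has no children.
Tree = Tuple[str, tuple]


class TreeAutomaton:
    """Deterministic bottom-up tree automaton (illustrative only, not CoBMiner)."""

    def __init__(
        self,
        transitions: Dict[Tuple[str, Tuple[str, ...]], str],
        accepting: Set[str],
    ) -> None:
        # transitions maps (node label, states of its children in order) to a state.
        self.transitions = transitions
        self.accepting = accepting

    def state_of(self, tree: Tree) -> Optional[str]:
        """Return the state reached at the root, or None if no transition applies."""
        label, children = tree
        child_states = []
        for child in children:
            state = self.state_of(child)
            if state is None:
                return None  # a subtree is already outside the constraint
            child_states.append(state)
        return self.transitions.get((label, tuple(child_states)))

    def accepts(self, tree: Tree) -> bool:
        return self.state_of(tree) in self.accepting


# Hypothetical constraint: a "book" node whose children are a title, then an author.
automaton = TreeAutomaton(
    transitions={
        ("title", ()): "qT",
        ("author", ()): "qA",
        ("book", ("qT", "qA")): "qOK",
    },
    accepting={"qOK"},
)

good = ("book", (("title", ()), ("author", ())))
bad = ("book", (("author", ()),))

# Post-filtering baseline: mine first, then keep only the accepted patterns.
candidates = [good, bad]
satisfying = [p for p in candidates if automaton.accepts(p)]
```

Under the approach the abstract describes, the constraint is checked during mining rather than afterwards. In terms of this sketch, one plausible reading is that a candidate whose subtrees already fail to reach any state could be discarded before it is counted, but how CoBMiner actually does this is not stated here.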

Bibliographic record

  • Source
    Information Systems, 2010, No. 5, pp. 570-591 (22 pages)
  • Author affiliations

    Faculdade de Computacao - Universidade Federal de Uberlandia, Campus Santa Monica, Bloco B - Uberlandia, MG, Brazil;


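  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords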

    frequent pattern discovery; tree pattern mining; tree automata; constraint-based mining; XML mining; web mining;

  • Record added: 2022-08-18 02:48:03
