Journal: Data & Knowledge Engineering

FRACTURE mining: Mining frequently and concurrently mutating structures from historical XML documents


Abstract

In the past few years, the fast proliferation of available XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositories. However, to the best of our knowledge, none of the existing work on XML mining has taken into account the dynamic nature of XML documents as online information. The present article proposes a novel type of frequent pattern, namely FRequently And Concurrently muTating substructUREs (FRACTURE), that is mined from the evolution of an XML document. A discovered FRACTURE is a set of substructures of an XML document that frequently change together. Knowledge obtained from FRACTUREs is useful in applications such as XML indexing and XML clustering. To keep the resulting patterns concise and explicit, we further formulate the problem of maximal FRACTURE mining. Two algorithms, which employ the level-wise and divide-and-conquer strategies respectively, are designed to mine the set of FRACTUREs. The second algorithm, which is more efficient, is also optimized to discover the set of maximal FRACTUREs. Experiments involving a wide range of synthetic and real-life datasets verify the efficiency and scalability of the developed algorithms.
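To make the idea of "substructures that frequently change together" concrete, the following is a minimal sketch of a generic level-wise (Apriori-style) search. It is not the paper's algorithm: the abstract does not give the formal FRACTURE definition or its change metrics. The sketch assumes each pair of consecutive document versions has already been reduced to a set of hypothetical substructure identifiers that changed, and that "frequent" simply means co-changing in at least `min_support` of those deltas. The function names `mine_cochanging_sets` and `maximal_sets` are illustrative only.

```python
from itertools import combinations


def mine_cochanging_sets(deltas, min_support):
    """Level-wise search for sets of substructures that change together.

    deltas: list of sets; each set holds hypothetical identifiers of the
            substructures that changed between two consecutive versions.
    min_support: minimum number of deltas in which a set must co-change
                 (an assumed, simplified notion of "frequent").
    Returns a dict mapping each frequent frozenset to its support count.
    """
    def support(itemset):
        return sum(1 for d in deltas if itemset <= d)

    items = sorted({s for d in deltas for s in d})
    level = {frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of current-level sets that have exactly k members.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent
        # and the candidate itself meets the support threshold.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


def maximal_sets(frequent):
    """Keep only frequent sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < t for t in frequent)]


# Toy input: four deltas over three hypothetical substructures.
deltas = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
frequent = mine_cochanging_sets(deltas, min_support=2)
maximal = maximal_sets(frequent)
```

Reporting only the maximal sets mirrors the motivation stated in the abstract: every subset of a frequent set is also frequent, so the maximal sets give a more concise summary of the same result.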
