Conference: International Conference on Database Systems for Advanced Applications

Efficient Evaluation of Partial Match Queries for XML Documents Using Information Retrieval Techniques

Abstract

We propose XIR, a novel method for processing partial match queries on heterogeneous XML documents using information retrieval (IR) techniques. A partial match query is one whose path expression contains the descendant-or-self axis "//". In its general form, a partial match query has branch predicates that form branching paths. The objective of XIR is to support this type of query efficiently for large-scale documents with heterogeneous schemas. XIR builds on the conventional schema-level methods that use relational tables, and significantly improves their efficiency with two techniques: an inverted index and a novel prefix match join. The former indexes the labels in label paths as if they were keywords in text, which finds the label paths matching a query more efficiently than the string matching used in the conventional methods. The latter supports branching path expressions and finds the result nodes more efficiently than the containment joins used in the conventional methods. We compare the efficiency of XIR with that of XRel and XParent using XML documents crawled from the Internet. The results show that XIR is more efficient than both XRel and XParent by several orders of magnitude for linear path expressions, and by a factor of several for branching path expressions.
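
As a rough illustration of the inverted index idea described in the abstract, the Python sketch below indexes the labels of a few label paths as keywords and uses the postings to find the label paths that match a linear partial match query. It is not the authors' implementation: the function names (build_index, parse_query, match_paths), the sample label paths, and the in-memory dictionaries are assumptions made for illustration, and the prefix match join for branching paths is not covered.

```python
from collections import defaultdict

# Hypothetical label paths; in a schema-level method each distinct
# root-to-node label path would be stored once and given an id.
SAMPLE_PATHS = [
    "/dblp/article/author",
    "/dblp/book/author",
    "/dblp/article/title",
    "/lib/book/chapter/author",
]

def build_index(label_paths):
    """Return (index, lengths): index maps label -> {path_id: [positions]}."""
    index = defaultdict(dict)
    lengths = []
    for pid, path in enumerate(label_paths):
        labels = path.strip("/").split("/")
        lengths.append(len(labels))
        for pos, label in enumerate(labels):
            index[label].setdefault(pid, []).append(pos)
    return index, lengths

def parse_query(query):
    """Split a linear path such as '//a/b' into [('//', 'a'), ('/', 'b')]."""
    steps, i = [], 0
    while i < len(query):
        axis = "//" if query.startswith("//", i) else "/"
        i += len(axis)
        j = query.find("/", i)
        j = len(query) if j == -1 else j
        steps.append((axis, query[i:j]))
        i = j
    return steps

def match_paths(index, lengths, query):
    """Return the sorted ids of label paths matched by a linear query."""
    steps = parse_query(query)
    # Keyword-style filtering: a path must contain every query label.
    candidates = set.intersection(
        *(set(index.get(label, {})) for _, label in steps)
    )
    matches = []
    for pid in sorted(candidates):
        reachable = None  # positions where the steps seen so far can end
        for axis, label in steps:
            positions = index[label][pid]
            if reachable is None:
                # First step: '/' anchors at the root, '//' allows any depth.
                reachable = {p for p in positions if axis == "//" or p == 0}
            elif axis == "/":
                # Child step: label must directly follow an earlier match.
                reachable = {p for p in positions if p - 1 in reachable}
            else:
                # '//' step: label must occur strictly below an earlier match.
                floor = min(reachable, default=None)
                reachable = (
                    set() if floor is None
                    else {p for p in positions if p > floor}
                )
        # The last query label must be the last label of the path.
        if lengths[pid] - 1 in reachable:
            matches.append(pid)
    return matches

if __name__ == "__main__":
    index, lengths = build_index(SAMPLE_PATHS)
    for q in ("//book//author", "/dblp//author", "//book/author"):
        print(q, [SAMPLE_PATHS[pid] for pid in match_paths(index, lengths, q)])
```

The intersection of posting lists discards label paths that lack any query label before positions are compared, which is the sense in which the labels are treated like keywords in text rather than matched as whole strings.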