Fast In-memory XPath search using compressed indexes

Abstract

Extensible Markup Language (XML) documents consist of text data plus structured data (markup). XPath allows querying both text and structure. Evaluating such hybrid queries is challenging. We present a system for in-memory evaluation of XPath search queries, that is, queries with text and structure predicates, yet without advanced features such as backward axes, arithmetic, and joins. We show that for this query fragment, which contains Forward Core XPath, our system, dubbed Succinct XML Self-Index ('SXSI'), outperforms existing systems by 1-3 orders of magnitude. SXSI is based on state-of-the-art indexes for text and structure data. It combines two novelties. On the one hand, it represents the XML data in a compact indexed form, which allows it to handle larger collections in main memory while supporting powerful search and navigation operations over the text and the structure. On the other hand, it features an execution engine that uses tree automata and cleverly chooses evaluation orders that leverage the speeds of the respective indexes. SXSI is modular and allows seamless replacement of its indexes. This is demonstrated through experiments with (1) a text index specialized for searching biological sequences, and (2) a word-based text index specialized for natural language search.
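The idea of choosing an evaluation order can be illustrated with a toy sketch. It is not SXSI's implementation: it uses Python's standard `xml.etree.ElementTree` rather than compressed indexes or tree automata, and the sample document, the function names `top_down` and `bottom_up`, and the query are all assumptions made for illustration. Both functions answer a query in the spirit of `//article[.//title[contains(., "indexes")]]`, one starting from the structure and one starting from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the paper.
DOC = """
<lib>
  <article><title>Compressed text indexes</title><year>2007</year></article>
  <article><title>Tree automata</title><year>2010</year></article>
  <book><title>Succinct indexes for XML</title></book>
</lib>
"""

def top_down(root, tag, child_tag, needle):
    """Structure first: visit every <tag> element, then test the text predicate."""
    return [el for el in root.iter(tag)
            if any(needle in (c.text or "") for c in el.iter(child_tag))]

def bottom_up(root, tag, child_tag, needle):
    """Text first: locate matching text, then climb to enclosing <tag> elements."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    selected = []
    # A linear scan stands in for the text index here; a real text index
    # would report the matching positions without visiting every element.
    for el in root.iter(child_tag):
        if needle in (el.text or ""):
            node = parent.get(el)
            while node is not None:
                if node.tag == tag and node not in selected:
                    selected.append(node)
                node = parent.get(node)
    return selected

root = ET.fromstring(DOC)
for strategy in (top_down, bottom_up):
    hits = strategy(root, "article", "title", "indexes")
    print(strategy.__name__, [h.find("title").text for h in hits])
```

Both strategies return the same answer; which one is cheaper depends on how selective the text predicate is compared with the structural one, which is the kind of trade-off the abstract attributes to the execution engine.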

Bibliographic details

  • Source
    Software | 2015, Issue 3 | pp. 399-434 | 36 pages
  • Author affiliations

    Departamento de Informatica, Universidad Tecnica Federico Santa Maria, Chile;

    Escuela de Informatica y Telecomunicaciones, Universidad Diego Portales, Chile;

    School of Informatics, University of Edinburgh, UK;

    HIIT and Department of Computer Science, University of Helsinki, Finland;

    Department of Computer Science, University of Chile, Chile;

    LRI, Universite Paris-Sud, France;

    Department of Computer Science, University of Chile, Chile;

    Department of Medical Genetics, Faculty of Medicine, University of Helsinki, Finland;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA);
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    XML; succinct data structures; XPath; tree automata;

