
LAF: a new XML encoding and indexing strategy for keyword-based XML search

Abstract

As a large number of corpora are represented, stored and published in XML format, finding useful information in XML databases has become an increasingly important issue. Keyword search enables web users to easily access XML data without the need to learn a structured query language or to study complex data schemas. Most existing indexing strategies for XML keyword search are based upon Dewey encoding. In this paper, we propose a new encoding method for XML documents called Level Order and Father (LAF). With LAF encoding, we devise a new index structure, called the two-layer LAF inverted index, which greatly reduces space consumption compared with a Dewey encoding-based inverted index. Furthermore, on top of the two-layer LAF inverted index, we propose a new keyword query algorithm called Algorithm based on Binary Search (ABS) that can quickly find all Smallest Lowest Common Ancestors (SLCAs). We experimentally evaluate the two-layer LAF inverted index and the ABS algorithm on four real XML data sets selected from Wikipedia. The experimental results demonstrate the advantages of our index method and querying algorithm: the space consumed by the two-layer LAF index is less than half of that consumed by the Dewey inverted index, and ABS is about one to two orders of magnitude faster than the classic Stack algorithm.
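The abstract names the LAF encoding but does not spell it out. As a rough illustration only, the sketch below assumes that LAF assigns each node its level-order (breadth-first) number and records the number of its father. The function names (`laf_encode`, `build_index`, `lca`, `slca`), the sample document and the brute-force SLCA step are all hypothetical; this is not the paper's two-layer index or its ABS algorithm.

```python
from collections import deque
from itertools import product
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the paper's data sets.
DOC = """<lib>
  <book><title>XML search</title><author>Smith</author></book>
  <book><title>Graph theory</title><author>Jones</author></book>
  <paper><title>XML index</title><author>Jones</author></paper>
</lib>"""

def laf_encode(root):
    """Number nodes in level order; father[i] is the number of node i's parent."""
    nodes, father = [root], [-1]          # the root gets number 0 and has no father
    queue = deque([0])
    while queue:
        nid = queue.popleft()
        for child in nodes[nid]:          # children in document order
            queue.append(len(nodes))
            nodes.append(child)
            father.append(nid)
    return nodes, father

def build_index(nodes):
    """Plain inverted index: keyword -> ascending list of level-order numbers."""
    index = {}
    for nid, node in enumerate(nodes):
        for word in set((node.text or "").lower().split()):
            index.setdefault(word, []).append(nid)
    return index

def lca(father, a, b):
    """In level order a larger number is never shallower, so lift the larger one."""
    while a != b:
        if a > b:
            a = father[a]
        else:
            b = father[b]
    return a

def is_proper_ancestor(father, u, v):
    """True if u lies strictly above v in the tree."""
    if u == v:
        return False
    while v > u:
        v = father[v]
    return v == u

def slca(father, list_a, list_b):
    """Brute force (assumption, not ABS): keep LCAs that have no LCA below them."""
    candidates = {lca(father, a, b) for a, b in product(list_a, list_b)}
    return sorted(u for u in candidates
                  if not any(is_proper_ancestor(father, u, v) for v in candidates))

nodes, father = laf_encode(ET.fromstring(DOC))
index = build_index(nodes)
result = slca(father, index["xml"], index["jones"])
print(father)                 # [-1, 0, 0, 0, 1, 1, 2, 2, 3, 3]
print(result)                 # [3]
print(nodes[result[0]].tag)   # paper
```

Under this assumption each node stores two integers regardless of its depth, whereas a Dewey label grows with depth, which is one plausible reading of the space saving the abstract reports. The pairwise SLCA step here is quadratic in the posting-list lengths and is meant only to make the definition concrete.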

Bibliographic details

  • Source
    Concurrency and Computation | 2013, Issue 11 | pp. 1604-1621 | 18 pages
  • Author affiliations

    Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing, China;

    Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing, China;

    Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing, China;

  • Original format: PDF
  • Language: English
  • Keywords

    XML keyword search; LAF; two-layer index; ABS; SLCA;
