International World Wide Web Conference; Edinburgh (GB)

Compressing and Searching XML Data Via Two Zips

Abstract

XML is fast becoming the standard format for storing, exchanging, and publishing data over the web, and is increasingly embedded in applications. Two challenges in handling XML are its size (the XML representation of a document is significantly larger than its native state) and the complexity of its search (XML search involves path and content searches on labeled tree structures). We address the basic problems of compression, navigation, and searching of XML documents. In particular, we adopt recently proposed theoretical algorithms [11] for succinct tree representations to design and implement a compressed index for XML, called XBzipIndex, in which the XML document is maintained in a highly compressed format, and both navigation and searching can be done by uncompressing only a tiny fraction of the data. This solution relies on compressing and indexing two arrays derived from the XML data. With detailed experiments we compare XBzipIndex with other compressed XML indexing and searching engines and show that its compression ratio is up to 35% better than that achievable by those tools, and that its time performance on some path and content search operations is orders of magnitude faster: a few milliseconds over hundreds of MBs of XML files versus tens of seconds, on standard XML data sources.
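
The abstract does not say how the two arrays are derived. As a rough illustration only, the sketch below assumes a transform in the spirit of the succinct tree representation cited as [11]: each element is keyed by the labels on its upward path (parent to root), elements are stably sorted by that key, and the sorted order yields one array of "last child" flags and one array of labels. The function name `two_arrays` and the names `s_last` and `s_alpha` are hypothetical, and text content and attributes are ignored, so this is not the XBzipIndex implementation.

```python
import xml.etree.ElementTree as ET


def two_arrays(xml_text):
    """Derive two arrays from an XML tree (illustrative sketch only).

    Assumption: nodes are ordered by their upward path (labels from the
    parent up to the root), with ties broken by document (pre-order) rank.
    Text nodes and attributes are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    rows = []  # (upward_path, preorder_rank, is_last_child, label)

    def visit(node, upward_path, is_last):
        rows.append((upward_path, len(rows), is_last, node.tag))
        children = list(node)
        # The upward path of a child starts with this node's own label.
        child_path = (node.tag,) + upward_path
        for i, child in enumerate(children):
            visit(child, child_path, i == len(children) - 1)

    # The root has an empty upward path; treating it as a "last child"
    # is a convention chosen for this sketch.
    visit(root, (), True)

    # Sort by upward path, then by pre-order rank (a stable ordering).
    rows.sort(key=lambda r: (r[0], r[1]))

    s_last = [int(r[2]) for r in rows]   # 1 if the node is its parent's last child
    s_alpha = [r[3] for r in rows]       # node labels in sorted order
    return s_last, s_alpha


if __name__ == "__main__":
    last, alpha = two_arrays("<a><b><c/></b><b/><c/></a>")
    print(last)
    print(alpha)
```

Under this ordering, all children of nodes that share the same root-to-node label path end up contiguous, which is the property that would let a path query be answered by narrowing a range over the arrays rather than by scanning the document. Each array can then be handed to a general-purpose compressor or a compressed index.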
