Conference: International Workshop on Database and Expert Systems Applications

Distributed Evaluation of XPath Axes Queries over Large XML Documents Stored in MapReduce Clusters

Abstract

The MapReduce (MR) framework, a programming model for parallel computation over data stored in a cluster of commodity computers, has established itself as one of the leading solutions for Big Data processing. The framework is also used as a query language in many database systems, because it can process data stored in various unstructured, semi-structured, and structured formats. Although the MR framework can also be used for XML data processing, it does not allow queries to be written in a declarative manner, as XPath or XQuery do. To overcome this problem, we propose a system that lets users query XML data with XPath while evaluating the queries in parallel using the MR framework. First, we introduce a persistent storage that maps XML data into a wide-column store. The proposed mapping enables efficient, distributed data processing. Second, we describe a query processor that translates a subset of the XPath language into MR jobs. Finally, we present tests whose results show the scalability of our system.
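
As a rough illustration of the two ideas in the abstract (a flat, row-oriented mapping of XML nodes, and axis evaluation expressed as map and reduce steps), the following Python sketch evaluates a descendant-axis step over partitioned rows. The pre/post-order numbering, the row layout, and every function name here are assumptions made for illustration. The abstract does not specify the actual mapping or how XPath is translated into MR jobs, and the sketch runs in a single process rather than on a cluster.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def shred(xml_text):
    """Flatten an XML document into rows keyed by pre-order number.

    Each row holds the element tag, the parent's key, and a post-order
    number (hypothetical layout, loosely imitating a wide-column row).
    """
    rows = {}
    counter = {"pre": 0, "post": 0}

    def visit(elem, parent_key):
        key = counter["pre"]
        counter["pre"] += 1
        rows[key] = {"tag": elem.tag, "parent": parent_key}
        for child in elem:
            visit(child, key)
        # Post-order number is assigned once all children are done.
        rows[key]["post"] = counter["post"]
        counter["post"] += 1

    visit(ET.fromstring(xml_text), None)
    return rows


def partition(rows, n_parts):
    """Split rows into n_parts lists, standing in for cluster partitions."""
    items = sorted(rows.items())
    return [items[i::n_parts] for i in range(n_parts)]


def map_descendants(part, ctx_pre, ctx_post, tag):
    """Map step: emit keys in this partition that are descendants of the
    context node (pre > ctx_pre and post < ctx_post) with the given tag."""
    return [key for key, row in part
            if row["tag"] == tag and key > ctx_pre and row["post"] < ctx_post]


def descendant_axis(rows, ctx_key, tag, n_parts=2):
    """Evaluate descendant::tag from the context node over all partitions."""
    ctx_post = rows[ctx_key]["post"]
    mapped = [map_descendants(p, ctx_key, ctx_post, tag)
              for p in partition(rows, n_parts)]
    # Reduce step: concatenate partial results and restore document order.
    return sorted(reduce(lambda a, b: a + b, mapped, []))


doc = "<a><b><c/></b><c/><d><c/></d></a>"
rows = shred(doc)
all_c = descendant_axis(rows, 0, "c")  # keys of every <c> below the root
```

Because each map step needs only the context node's two numbers and its own partition of rows, the partitions can be scanned independently, which is the property that makes this kind of encoding convenient for parallel evaluation.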
