2011 Eighth Web Information Systems and Applications Conference

Efficient Star Join for Column-oriented Data Store in the MapReduce Environment

Abstract

MapReduce is a parallel computing paradigm that has gained a lot of attention from both industry and academia in recent years. Unlike with parallel DBMSs, MapReduce makes it easier for non-experts to develop scalable parallel programs for analytical applications over huge data sets across clusters of commodity machines. Because MapReduce processing is scan-oriented, it inevitably accesses many unnecessary data tuples, especially for table join operators, so its performance on relational operators can be improved dramatically. In this paper, we propose an efficient star join strategy called HdBmp join for column-oriented data stores, which uses a three-level content-aware index (i.e., the HdBmp Index). Armed with this index, most of the unnecessary tuples in join processing can be filtered out, resulting in an immense reduction in both communication cost and execution time. Our extensive experimental studies confirm the efficiency, scalability, and effectiveness of our newly proposed join methods.
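
The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the HdBmp index itself: the abstract does not describe its three-level structure, so the sketch assumes a plain single-level bitmap per dimension and runs in one process rather than on MapReduce. All table names, column names, and predicates below are hypothetical.

```python
# Illustrative only: a single-level bitmap filter for a star join.
# Table names, columns, and predicates are hypothetical, and this is NOT the
# three-level HdBmp index, whose structure the abstract does not describe.

def build_bitmap(fact_fk_column, qualifying_keys):
    """One bit per fact row: 1 if the row's foreign key passes the dimension predicate."""
    return [1 if fk in qualifying_keys else 0 for fk in fact_fk_column]


def star_join(fact, dimensions, predicates):
    """Join a column-oriented fact table with its dimension tables.

    fact:        dict of column name -> list of values (one list per column)
    dimensions:  dict of foreign-key column name -> {key: dimension row dict}
    predicates:  dict of foreign-key column name -> function(row) -> bool
    Returns the joined rows and the number of fact rows that survived filtering.
    """
    n_rows = len(next(iter(fact.values())))
    combined = [1] * n_rows
    for fk_col, dim in dimensions.items():
        # Keys of dimension rows that satisfy this dimension's predicate.
        keep = {key for key, row in dim.items() if predicates[fk_col](row)}
        bitmap = build_bitmap(fact[fk_col], keep)
        # AND the per-dimension bitmaps so a row must pass every predicate.
        combined = [a & b for a, b in zip(combined, bitmap)]

    result = []
    for i, bit in enumerate(combined):
        if not bit:
            continue  # filtered out before any join work is done
        joined = {col: values[i] for col, values in fact.items()}
        for fk_col, dim in dimensions.items():
            joined.update(dim[fact[fk_col][i]])
        result.append(joined)
    return result, sum(combined)


# Hypothetical sample data: a sales fact table with date and store dimensions.
fact = {
    "date_id": [1, 2, 1, 3],
    "store_id": [10, 10, 20, 20],
    "amount": [5.0, 7.5, 3.0, 9.0],
}
dimensions = {
    "date_id": {1: {"year": 2010}, 2: {"year": 2011}, 3: {"year": 2011}},
    "store_id": {10: {"region": "east"}, 20: {"region": "west"}},
}
predicates = {
    "date_id": lambda row: row["year"] == 2011,
    "store_id": lambda row: row["region"] == "west",
}

rows, surviving = star_join(fact, dimensions, predicates)
print(surviving, rows)
```

In a distributed setting, the point of such a filter would be that rows with a cleared bit are never read or shuffled, which is where savings in communication cost of the kind the abstract reports would come from.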
