Conference: 2011 Eighth Web Information Systems and Applications Conference

Efficient Star Join for Column-oriented Data Store in the MapReduce Environment

Abstract

MapReduce is a parallel computing paradigm that has gained considerable attention from both industry and academia in recent years. Compared with parallel DBMSs, MapReduce makes it easier for non-experts to develop scalable parallel programs for analytical applications over huge data sets across clusters of commodity machines. Because MapReduce processing is scan-oriented, it inevitably accesses many unnecessary tuples, especially for table join operators, so its performance on relational operators can be improved dramatically. In this paper, we propose an efficient star join strategy for column-oriented data stores, called HdBmp join, which uses a three-level content-aware index (the HdBmp index). With this index, most of the unnecessary tuples can be filtered out during join processing, which greatly reduces both communication cost and execution time. Our extensive experimental studies confirm the efficiency, scalability, and effectiveness of our newly proposed join methods.
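The abstract does not describe the internals of the HdBmp index, so the following is only a generic sketch of the underlying idea: filtering fact-table rows with bitmaps before a star join. It is not the paper's three-level index. The names `build_bitmap`, `star_join_rows`, and the sample `fact` columns are hypothetical, and a Python integer stands in for a bitmap.

```python
from typing import Dict, List, Set


def build_bitmap(fk_column: List[int], matching_keys: Set[int]) -> int:
    """Return an int used as a bitmap: bit i is set when fact row i
    references a dimension key that passed the dimension predicate."""
    bitmap = 0
    for row, key in enumerate(fk_column):
        if key in matching_keys:
            bitmap |= 1 << row
    return bitmap


def star_join_rows(fact: Dict[str, List[int]],
                   filters: Dict[str, Set[int]]) -> List[int]:
    """Return the indices of fact rows that satisfy every dimension filter.

    `fact` maps column names to equally long column vectors (a toy
    column-oriented layout). `filters` maps a foreign-key column name to
    the set of dimension keys that satisfy that dimension's predicate.
    """
    n_rows = len(next(iter(fact.values())))
    survivors = (1 << n_rows) - 1          # start with all rows selected
    for fk_name, keys in filters.items():
        survivors &= build_bitmap(fact[fk_name], keys)  # AND the bitmaps
    return [row for row in range(n_rows) if (survivors >> row) & 1]


# Hypothetical fact table with two foreign-key columns and one measure.
fact = {
    "date_id":  [1, 2, 1, 3],
    "store_id": [10, 10, 20, 20],
    "amount":   [5, 7, 9, 11],
}

# Only rows with date_id = 1 and store_id = 20 need to be read and shipped.
rows = star_join_rows(fact, {"date_id": {1}, "store_id": {20}})
amounts = [fact["amount"][r] for r in rows]
```

In a MapReduce setting, a filter of this kind would plausibly be applied on the map side so that rows failing any dimension predicate are never shuffled, which is the kind of communication saving the abstract refers to.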
