Conference: Asia-Pacific Web Conference

A Framework for OLAP in Column-Store Database: One-Pass Join and Pushing the Materialization to the End



Abstract

In a data warehouse modeled with the star schema, data are usually retrieved by joining the fact table with one or more dimension tables and then applying selection and projection, and the join is the most expensive operator in an RDBMS. A column-store database can perform a join in two ways: early materialization join (EM join) and late materialization join (LM join). In EM join, the columns involved in the query are first glued together into rows, and the glued rows are then sent to the join operator. In LM join, by contrast, only the attributes that participate in the join are accessed. LM join cannot ignore the problem of out-of-order access to the inner table; if it does, the naive LM join is usually slower than EM join [9]. Because late materialization is good for memory bandwidth and CPU efficiency, LM join attracts more attention in the academic research community. State-of-the-art LM joins in column stores, such as the radix-cluster hash join [8] in MonetDB and the invisible join [10] in C-Store, all try to avoid random table access. In this paper, we devise a framework for OLAP called CDDTA-MMDB, which introduces a new join algorithm called CDDTA-LWMJoin (abbreviated to LWMJoin in the following). LWMJoin builds on our prior work, CDDTA-Join [7]. We equip CDDTA-Join with light-weight materialization (LWM), which is designed to cut down memory accesses and reduce the production of intermediate data structures. Experiments show that CDDTA-MMDB is efficient and can be 2x faster than MonetDB and 4x faster than invisible join in the context of a data warehouse modeled with the star schema.
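
The difference between EM join and LM join described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' LWMJoin or CDDTA-Join; it is a generic toy example, and the tables, column names (`cust_key`, `revenue`, `region`) and function names (`em_join`, `lm_join`) are all hypothetical.

```python
# Toy star schema stored column-wise. All names and values are
# hypothetical and chosen only for illustration.
fact = {
    "cust_key": [2, 0, 1, 2, 0],
    "revenue": [10, 20, 30, 40, 50],
}
dim = {
    "cust_key": [0, 1, 2],
    "region": ["ASIA", "EUROPE", "ASIA"],
}


def em_join(fact, dim, region):
    """Early materialization: glue columns into rows, then join the rows."""
    fact_rows = list(zip(fact["cust_key"], fact["revenue"]))
    dim_rows = list(zip(dim["cust_key"], dim["region"]))
    lookup = {key: reg for key, reg in dim_rows}
    out = []
    for key, rev in fact_rows:
        reg = lookup[key]
        if reg == region:
            out.append((reg, rev))
    return out


def lm_join(fact, dim, region):
    """Late materialization: join on key columns only, fetch values last."""
    # Step 1: read only the dimension key and predicate columns and
    # remember the positions of the qualifying dimension rows.
    dim_pos = {
        key: pos
        for pos, (key, reg) in enumerate(zip(dim["cust_key"], dim["region"]))
        if reg == region
    }
    # Step 2: scan only the fact foreign-key column and emit position pairs.
    pairs = [
        (f_pos, dim_pos[key])
        for f_pos, key in enumerate(fact["cust_key"])
        if key in dim_pos
    ]
    # Step 3: materialize the output columns by position at the very end.
    # The dimension positions in `pairs` are not sorted, so this step reads
    # the dimension columns out of order.
    return [(dim["region"][d], fact["revenue"][f]) for f, d in pairs]
```

Both functions return the same rows for the same predicate. In `lm_join`, step 3 looks up dimension values at unsorted positions, which is a small-scale picture of the out-of-order access to the inner table that the abstract says LM join cannot ignore.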
